NOVEL NUCLEIC ACIDS AND POLYPEPTIDES



1. TECHNICAL FIELD 

The present invention provides novel polynucleotides and proteins encoded by such
polynucleotides, along with uses for these polynucleotides and proteins, for example in
therapeutic, diagnostic and research methods.

2. BACKGROUND 

Technology aimed at the discovery of protein factors (including e.g., cytokines, such as
lymphokines, interferons, CSFs, chemokines, and interleukins) has matured rapidly over the past
decade. The now routine hybridization cloning and expression cloning techniques clone novel
polynucleotides "directly" in the sense that they rely on information directly related to the
discovered protein (i.e., partial DNA/amino acid sequence of the protein in the case of
hybridization cloning; activity of the protein in the case of expression cloning). More recent
"indirect" cloning techniques such as signal sequence cloning, which isolates DNA sequences
based on the presence of a now well-recognized secretory leader sequence motif, as well as
various PCR-based or low stringency hybridization-based cloning techniques, have advanced the
state of the art by making available large numbers of DNA/amino acid sequences for proteins
that are known to have biological activity, for example, by virtue of their secreted nature in the
case of leader sequence cloning, by virtue of their cell or tissue source in the case of PCR-based
techniques, or by virtue of structural similarity to other genes of known biological activity.

Identified polynucleotide and polypeptide sequences have numerous applications in, for
example, diagnostics, forensics, gene mapping; identification of mutations responsible for
genetic disorders or other traits, to assess biodiversity, and to produce many other types of data
and products dependent on DNA and amino acid sequences.

3. SUMMARY OF THE INVENTION 

The compositions of the present invention include novel isolated polypeptides, novel
isolated polynucleotides encoding such polypeptides, including recombinant DNA molecules,
cloned genes or degenerate variants thereof, especially naturally occurring variants such as allelic
variants, antisense polynucleotide molecules, and antibodies that specifically recognize one or more
epitopes present on such polypeptides, as well as hybridomas producing such antibodies.

The compositions of the present invention additionally include vectors, including expression
vectors, containing the polynucleotides of the invention, cells genetically engineered to contain such
polynucleotides and cells genetically engineered to express such polynucleotides.

The present invention relates to a collection or library of at least one novel nucleic acid
sequence assembled from expressed sequence tags (ESTs) isolated mainly by sequencing by
hybridization (SBH), and in some cases, sequences obtained from one or more public databases.
The invention relates also to the proteins encoded by such polynucleotides, along with therapeutic,
diagnostic and research utilities for these polynucleotides and proteins. These nucleic acid
sequences are designated as SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954. The
polypeptide sequences are designated SEQ ID NO: 985-1968, 2953-3936, 3943-3948 or 3955-3960.
The nucleic acids and polypeptides are provided in the Sequence Listing. In the nucleic acids
provided in the Sequence Listing, A is adenine; C is cytosine; G is guanine; T is thymine; and N
is any of the four bases. In the amino acids provided in the Sequence Listing, * corresponds to the
stop codon.

The nucleic acid sequences of the present invention also include nucleic acid sequences that
hybridize to the complement of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954 under
stringent hybridization conditions; nucleic acid sequences which are allelic variants or species
homologues of any of the nucleic acid sequences recited above; or nucleic acid sequences that
encode a peptide comprising a specific domain or truncation of the peptides encoded by SEQ ID
NO: 1-984, 1969-2952, 3937-3942 or 3949-3954. Also included is a polynucleotide comprising a
nucleotide sequence having at least 90% identity to an identifying sequence of SEQ ID NO: 1-984,
1969-2952, 3937-3942 or 3949-3954 or a degenerate variant or fragment thereof. The identifying
sequence can be 100 base pairs in length.

The nucleic acid sequences of the present invention also include the sequence information
from the nucleic acid sequences of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954. The
sequence information can be a segment of any one of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or
3949-3954 that uniquely identifies or represents the sequence information of SEQ ID NO: 1-984,
1969-2952, 3937-3942 or 3949-3954.

A collection as used in this application can be a collection of only one polynucleotide. The
collection of sequence information or identifying information of each sequence can be provided on
a nucleic acid array. In one embodiment, segments of sequence information are provided on a
nucleic acid array to detect the polynucleotide that contains the segment. The array can be designed
to detect full-match or mismatch to the polynucleotide that contains the segment. The collection
can also be provided in a computer-readable format.

This invention also includes the reverse or direct complement of any of the nucleic acid
sequences recited above; cloning or expression vectors containing the nucleic acid sequences; and
host cells or organisms transformed with these expression vectors. Nucleic acid sequences (or their
reverse or direct complements) according to the invention have numerous applications in a variety
of techniques known to those skilled in the art of molecular biology, such as use as hybridization
probes, use as primers for PCR, use in an array, use in computer-readable media, use in sequencing
full-length genes, use for chromosome and gene mapping, use in the recombinant production of
protein, and use in the generation of anti-sense DNA or RNA, their chemical analogs and the like.

In a preferred embodiment, the nucleic acid sequences of SEQ ID NO: 1-984, 1969-2952,
3937-3942 or 3949-3954 or novel segments or parts of the nucleic acids of the invention are used as
primers in expression assays that are well known in the art. In a particularly preferred embodiment,
the nucleic acid sequences of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954 or novel
segments or parts of the nucleic acids provided herein are used in diagnostics for identifying
expressed genes or, as well known in the art and exemplified by Vollrath et al., Science 258:52-59
(1992), as expressed sequence tags for physical mapping of the human genome.

The isolated polynucleotides of the invention include, but are not limited to, a
polynucleotide comprising any one of the nucleotide sequences set forth in SEQ ID NO: 1-984,
1969-2952, 3937-3942 or 3949-3954; a polynucleotide comprising any of the full length protein
coding sequences of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954; and a polynucleotide
comprising any of the nucleotide sequences of the mature protein coding sequences of SEQ ID
NO: 1-984, 1969-2952, 3937-3942 or 3949-3954. The polynucleotides of the present invention also
include, but are not limited to, a polynucleotide that hybridizes under stringent hybridization
conditions to (a) the complement of any one of the nucleotide sequences set forth in SEQ ID NO:
1-984, 1969-2952, 3937-3942 or 3949-3954; (b) a nucleotide sequence encoding any one of the
amino acid sequences set forth in the Sequence Listing; (c) a polynucleotide which is an allelic
variant of any polynucleotides recited above; (d) a polynucleotide which encodes a species homolog
(e.g. orthologs) of any of the proteins recited above; or (e) a polynucleotide that encodes a
polypeptide comprising a specific domain or truncation of any of the polypeptides comprising an
amino acid sequence set forth in the Sequence Listing.

The isolated polypeptides of the invention include, but are not limited to, a polypeptide
comprising any of the amino acid sequences set forth in SEQ ID NO: 985-1968, 2953-3936,
3943-3948 or 3955-3960; or the corresponding full length or mature protein. Polypeptides of the
invention also include polypeptides with biological activity that are encoded by (a) any of the
polynucleotides having a nucleotide sequence set forth in SEQ ID NO: 1-984, 1969-2952,
3937-3942 or 3949-3954; or (b) polynucleotides that hybridize to the complement of the
polynucleotides of (a) under stringent hybridization conditions. Biologically or immunologically
active variants of any of the polypeptide sequences in the Sequence Listing, and "substantial
equivalents" thereof (e.g., with at least about 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98% or 99%
amino acid sequence identity) that preferably retain biological activity are also contemplated. The
polypeptides of the invention may be wholly or partially chemically synthesized but are preferably
produced by recombinant means using the genetically engineered cells (e.g. host cells) of the
invention.

The invention also provides compositions comprising a polypeptide of the invention.
Polypeptide compositions of the invention may further comprise an acceptable carrier, such as a
hydrophilic, e.g., pharmaceutically acceptable, carrier.

The invention also provides host cells transformed or transfected with a polynucleotide of
the invention.

The invention also relates to methods for producing a polypeptide of the invention
comprising growing a culture of the host cells of the invention in a suitable culture medium
under conditions permitting expression of the desired polypeptide, and purifying the polypeptide
from the culture or from the host cells. Preferred embodiments include those in which the
protein produced by such process is a mature form of the protein.

Polynucleotides according to the invention have numerous applications in a variety of
techniques known to those skilled in the art of molecular biology. These techniques include use
as hybridization probes, use as oligomers, or primers, for PCR, use for chromosome and gene
mapping, use in the recombinant production of protein, and use in generation of anti-sense DNA
or RNA, their chemical analogs and the like. For example, when the expression of an mRNA is
largely restricted to a particular cell or tissue type, polynucleotides of the invention can be used
as hybridization probes to detect the presence of the particular cell or tissue mRNA in a sample
using, e.g., in situ hybridization.

In other exemplary embodiments, the polynucleotides are used in diagnostics as
expressed sequence tags for identifying expressed genes or, as well known in the art and
exemplified by Vollrath et al., Science 258:52-59 (1992), as expressed sequence tags for physical
mapping of the human genome.

The polypeptides according to the invention can be used in a variety of conventional
procedures and methods that are currently applied to other proteins. For example, a polypeptide
of the invention can be used to generate an antibody that specifically binds the polypeptide. Such
antibodies, particularly monoclonal antibodies, are useful for detecting or quantitating the
polypeptide in tissue. The polypeptides of the invention can also be used as molecular weight
markers, and as a food supplement.

Methods are also provided for preventing, treating, or ameliorating a medical condition,
which comprise the step of administering to a mammalian subject a therapeutically effective
amount of a composition comprising a polypeptide of the present invention and a
pharmaceutically acceptable carrier.

In particular, the polypeptides and polynucleotides of the invention can be utilized, for
example, in methods for the prevention and/or treatment of disorders involving aberrant protein
expression or biological activity.

The present invention further relates to methods for detecting the presence of the
polynucleotides or polypeptides of the invention in a sample. Such methods can, for example, be
utilized as part of prognostic and diagnostic evaluation of disorders as recited herein and for the
identification of subjects exhibiting a predisposition to such conditions. The invention provides
a method for detecting the polynucleotides of the invention in a sample, comprising contacting
the sample with a compound that binds to and forms a complex with the polynucleotide of
interest for a period sufficient to form the complex and under conditions sufficient to form a
complex and detecting the complex such that if a complex is detected, the polynucleotide of
interest is detected. The invention also provides a method for detecting the polypeptides of the
invention in a sample comprising contacting the sample with a compound that binds to and forms
a complex with the polypeptide under conditions and for a period sufficient to form the complex
and detecting the formation of the complex such that if a complex is formed, the polypeptide is
detected.

The invention also provides kits comprising polynucleotide probes and/or monoclonal
antibodies, and optionally quantitative standards, for carrying out methods of the invention.
Furthermore, the invention provides methods for evaluating the efficacy of drugs, and
monitoring the progress of patients, involved in clinical trials for the treatment of disorders as
recited above.

The invention also provides methods for the identification of compounds that modulate
(i.e., increase or decrease) the expression or activity of the polynucleotides and/or polypeptides
of the invention. Such methods can be utilized, for example, for the identification of compounds
that can ameliorate symptoms of disorders as recited herein. Such methods can include, but are
not limited to, assays for identifying compounds and other substances that interact with (e.g.,
bind to) the polypeptides of the invention. The invention provides a method for identifying a
compound that binds to the polypeptides of the invention comprising contacting the compound
with a polypeptide of the invention in a cell for a time sufficient to form a polypeptide/compound
complex, wherein the complex drives expression of a reporter gene sequence in the cell; and
detecting the complex by detecting the reporter gene sequence expression such that if expression
of the reporter gene is detected, the compound that binds to a polypeptide of the invention is
identified.

The invention also provides methods for treatment which involve the
administration of the polynucleotides or polypeptides of the invention to individuals exhibiting
symptoms or tendencies. In addition, the invention encompasses methods for treating diseases or
disorders as recited herein comprising administering compounds and other substances that
modulate the overall activity of the target gene products. Compounds and other substances can
effect such modulation either on the level of target gene/protein expression or target protein
activity.

The polypeptides of the present invention and the polynucleotides encoding them are also
useful for the same functions known to one of skill in the art as the polypeptides and
polynucleotides to which they have homology (set forth in Tables 2 and 9); for which they have
a signature region (as set forth in Tables 3 and 10); or for which they have homology to a gene
family (as set forth in Tables 4 and 11). If no homology is set forth for a sequence, then the
polypeptides and polynucleotides of the present invention are useful for a variety of applications,
as described herein, including use in arrays for detection.

4. DETAILED DESCRIPTION OF THE INVENTION

4.1 DEFINITIONS

It must be noted that as used herein and in the appended claims, the singular forms "a",
"an" and "the" include plural references unless the context clearly dictates otherwise.

The term "active" refers to those forms of the polypeptide which retain the biologic
and/or immunologic activities of any naturally occurring polypeptide. According to the
invention, the terms "biologically active" or "biological activity" refer to a protein or peptide
having structural, regulatory or biochemical functions of a naturally occurring molecule.
Likewise "immunologically active" or "immunological activity" refers to the capability of the
natural, recombinant or synthetic polypeptide to induce a specific immune response in
appropriate animals or cells and to bind with specific antibodies.

The term "activated cells" as used in this application refers to those cells which are engaged in
extracellular or intracellular membrane trafficking, including the export of secretory or
enzymatic molecules as part of a normal or disease process.

The terms "complementary" or "complementarity" refer to the natural binding of
polynucleotides by base pairing. For example, the sequence 5'-AGT-3' binds to the
complementary sequence 3'-TCA-5'. Complementarity between two single-stranded molecules
may be "partial" such that only some of the nucleic acids bind or it may be "complete" such that
total complementarity exists between the single stranded molecules. The degree of
complementarity between the nucleic acid strands has significant effects on the efficiency and
strength of the hybridization between the nucleic acid strands.
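
The base-pairing rule in this definition can be illustrated with a short Python sketch. The
function names below are hypothetical and introduced only for illustration; the sketch assumes
plain Watson-Crick pairing of A, C, G and T, and scores "partial" versus "complete"
complementarity as the fraction of paired positions.

```python
# Illustrative sketch only; names are hypothetical and not taken from the specification.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(seq: str) -> str:
    """Base-by-base complement; for a 5'->3' input the result reads 3'->5'."""
    return "".join(COMPLEMENT[base] for base in seq.upper())


def reverse_complement(seq: str) -> str:
    """Complementary strand written in the conventional 5'->3' direction."""
    return complement(seq)[::-1]


def complementarity(strand_5_to_3: str, strand_3_to_5: str) -> float:
    """Fraction of aligned positions that base-pair (1.0 means complete)."""
    if len(strand_5_to_3) != len(strand_3_to_5):
        raise ValueError("strands must have the same length")
    paired = sum(
        COMPLEMENT[a] == b
        for a, b in zip(strand_5_to_3.upper(), strand_3_to_5.upper())
    )
    return paired / len(strand_5_to_3)


if __name__ == "__main__":
    print(complement("AGT"))              # the 3'-TCA-5' partner from the text
    print(complementarity("AGT", "TCA"))  # complete complementarity
    print(complementarity("AGT", "TCG"))  # partial complementarity
```

Here 5'-AGT-3' pairs completely with 3'-TCA-5', while a strand differing at one position pairs
only partially.
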
The term "embryonic stem cells (ES)" refers to a cell that can give rise to many
differentiated cell types in an embryo or an adult, including the germ cells. The term "germ line
stem cells (GSCs)" refers to stem cells derived from primordial stem cells that provide a steady
and continuous source of germ cells for the production of gametes. The term "primordial germ
cells (PGCs)" refers to a small population of cells set aside from other cell lineages particularly
from the yolk sac, mesenteries, or gonadal ridges during embryogenesis that have the potential to
differentiate into germ cells and other cells. PGCs are the source from which GSCs and ES cells
are derived. The PGCs, the GSCs and the ES cells are capable of self-renewal. Thus these cells
not only populate the germ line and give rise to a plurality of terminally differentiated cells that
comprise the adult specialized organs, but are able to regenerate themselves.

The term "expression modulating fragment," EMF, means a series of nucleotides which
modulates the expression of an operably linked ORF or another EMF.

As used herein, a sequence is said to "modulate the expression of an operably linked
sequence" when the expression of the sequence is altered by the presence of the EMF. EMFs
include, but are not limited to, promoters, and promoter modulating sequences (inducible
elements). One class of EMFs are nucleic acid fragments which induce the expression of an
operably linked ORF in response to a specific regulatory factor or physiological event.

The terms "nucleotide sequence" or "nucleic acid" or "polynucleotide" or
"oligonucleotide" are used interchangeably and refer to a heteropolymer of nucleotides or the
sequence of these nucleotides. These phrases also refer to DNA or RNA of genomic or synthetic
origin which may be single-stranded or double-stranded and may represent the sense or the
antisense strand, to peptide nucleic acid (PNA) or to any DNA-like or RNA-like material. In the
sequences herein A is adenine, C is cytosine, T is thymine, G is guanine and N is A, C, G or T
(U). It is contemplated that where the polynucleotide is RNA, the T (thymine) in the sequences
provided herein is substituted with U (uracil). Generally, nucleic acid segments provided by this
invention may be assembled from fragments of the genome and short oligonucleotide linkers, or
from a series of oligonucleotides, or from individual nucleotides, to provide a synthetic nucleic
acid which is capable of being expressed in a recombinant transcriptional unit comprising
regulatory elements derived from a microbial or viral operon, or a eukaryotic gene.

The terms "oligonucleotide fragment" or a "polynucleotide fragment", "portion," or
"segment" or "probe" or "primer" are used interchangeably and refer to a sequence of nucleotide
residues which are at least about 5 nucleotides, more preferably at least about 7 nucleotides,
more preferably at least about 9 nucleotides, more preferably at least about 11 nucleotides and
most preferably at least about 17 nucleotides. The fragment is preferably less than about 500
nucleotides, preferably less than about 200 nucleotides, more preferably less than about 100
nucleotides, more preferably less than about 50 nucleotides and most preferably less than 30
nucleotides. Preferably the probe is from about 6 nucleotides to about 200 nucleotides,
preferably from about 15 to about 50 nucleotides, more preferably from about 17 to 30
nucleotides and most preferably from about 20 to 25 nucleotides. Preferably the fragments can
be used in polymerase chain reaction (PCR), various hybridization procedures or microarray
procedures to identify or amplify identical or related parts of mRNA or DNA molecules. A
fragment or segment may uniquely identify each polynucleotide sequence of the present
invention. Preferably the fragment comprises a sequence substantially similar to any one of SEQ
ID NOs: 1-20.

Probes may, for example, be used to determine whether specific mRNA molecules are
present in a cell or tissue or to isolate similar nucleic acid sequences from chromosomal DNA as
described by Walsh et al. (Walsh, P.S. et al., 1992, PCR Methods Appl 1:241-250). They may
be labeled by nick translation, Klenow fill-in reaction, PCR, or other methods well known in the
art. Probes of the present invention, their preparation and/or labeling are elaborated in
Sambrook, J. et al., 1989, Molecular Cloning: A Laboratory Manual, Cold Spring Harbor
Laboratory, NY; or Ausubel, F.M. et al., 1989, Current Protocols in Molecular Biology, John
Wiley & Sons, New York NY, both of which are incorporated herein by reference in their
entirety.

The nucleic acid sequences of the present invention also include the sequence
information from the nucleic acid sequences of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or
3949-3954. The sequence information can be a segment of any one of SEQ ID NO: 1-984,
1969-2952, 3937-3942 or 3949-3954 that uniquely identifies or represents the sequence
information of that sequence of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954. One
such segment can be a twenty-mer nucleic acid sequence because the probability that a twenty-mer
is fully matched in the human genome is 1 in 300. In the human genome, there are three
billion base pairs in one set of chromosomes. Because 4^20 possible twenty-mers exist, there are
300 times more twenty-mers than there are base pairs in a set of human chromosomes. Using the
same analysis, the probability for a seventeen-mer to be fully matched in the human genome is
approximately 1 in 5. When these segments are used in arrays for expression studies, fifteen-mer
segments can be used. The probability that the fifteen-mer is fully matched in the expressed
sequences is also approximately one in five because expressed sequences comprise less than
approximately 5% of the entire genome sequence.
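
The arithmetic behind these figures can be reproduced with a short Python sketch. It is an
illustration under stated assumptions, not part of the specification: the function names are
hypothetical, the genome size (three billion base pairs) and the 5% expressed fraction are taken
from the paragraph above, and every k-mer is assumed to be equally likely. The exact ratios come
out near 370, 6 and 7, the same order as the rounded "1 in 300" and "1 in 5" figures in the text.

```python
# Illustrative sketch only; names are hypothetical and not taken from the specification.
GENOME_BP = 3_000_000_000   # base pairs in one set of human chromosomes (from the text)
EXPRESSED_FRACTION = 0.05   # expressed share of the genome, upper bound (from the text)


def possible_kmers(k: int) -> int:
    """Number of distinct sequences of length k over the alphabet A, C, G, T."""
    return 4 ** k


def kmers_per_position(k: int, target_bp: float) -> float:
    """How many times more possible k-mers exist than positions in the target.

    A chance full match is then expected roughly '1 in' this many times.
    """
    return possible_kmers(k) / target_bp


if __name__ == "__main__":
    expressed_bp = GENOME_BP * EXPRESSED_FRACTION
    print(f"20-mer, whole genome:        1 in {kmers_per_position(20, GENOME_BP):.1f}")
    print(f"17-mer, whole genome:        1 in {kmers_per_position(17, GENOME_BP):.1f}")
    print(f"15-mer, expressed sequences: 1 in {kmers_per_position(15, expressed_bp):.1f}")
```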

Similarly, when using sequence information for detecting a single mismatch, a segment can
be a twenty-five mer. The probability that the twenty-five mer would appear in a human genome
with a single mismatch is calculated by multiplying the probability for a full match (1/4^25) times
the increased probability for mismatch at each nucleotide position (3 x 25). The probability that an
eighteen mer with a single mismatch can be detected in an array for expression studies is
approximately one in five. The probability that a twenty-mer with a single mismatch can be
detected in a human genome is approximately one in five.
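
The single-mismatch calculation can be sketched in Python as follows. This is a hedged
illustration, not the specification's own method: the function name is hypothetical, and
scaling the per-position probability by the size of the target (three billion base pairs for the
genome, 5% of that for expressed sequences) is an assumption made here to connect the formula to
the "one in five" figures. Exact evaluation gives values somewhat below one in five.

```python
# Illustrative sketch only; names are hypothetical and not taken from the specification.
GENOME_BP = 3_000_000_000             # figure used in the text
EXPRESSED_BP = GENOME_BP * 0.05       # assumption: 5% of the genome is expressed


def expected_single_mismatch_hits(k: int, target_bp: float) -> float:
    """Expected number of chance sites matching a k-mer with exactly one mismatch.

    full_match:       probability that a random site matches all k bases, 1 / 4^k
    mismatch_factor:  number of single-substitution variants, 3 alternatives x k positions
    """
    full_match = 1 / 4 ** k
    mismatch_factor = 3 * k
    return target_bp * mismatch_factor * full_match


if __name__ == "__main__":
    for k, target, label in [
        (25, GENOME_BP, "25-mer, whole genome"),
        (20, GENOME_BP, "20-mer, whole genome"),
        (18, EXPRESSED_BP, "18-mer, expressed sequences"),
    ]:
        print(f"{label}: {expected_single_mismatch_hits(k, target):.4f}")
```
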
The term "open reading frame," ORF, means a series of nucleotide triplets coding for
amino acids without any termination codons and is a sequence translatable into protein.
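
As a minimal sketch of this definition, the Python function below scans the three forward
reading frames of a DNA sequence and reports every run of triplets that contains no termination
codon. The function name and the `min_codons` parameter are hypothetical, the stop codons are
those of the standard genetic code, and, following the definition above, no start codon is
required.

```python
# Illustrative sketch only; names are hypothetical and not taken from the specification.
STOP_CODONS = {"TAA", "TAG", "TGA"}  # termination codons of the standard genetic code


def open_reading_frames(seq: str, min_codons: int = 1):
    """Yield (start, end) slices of stop-free triplet runs in the three forward frames."""
    seq = seq.upper()
    for frame in range(3):
        run_start = frame
        # Walk the frame one complete codon at a time.
        for i in range(frame, len(seq) - 2, 3):
            if seq[i:i + 3] in STOP_CODONS:
                if (i - run_start) // 3 >= min_codons:
                    yield (run_start, i)
                run_start = i + 3  # the next run begins after the stop codon
        # Close the final run at the last complete codon of this frame.
        end = frame + ((len(seq) - frame) // 3) * 3
        if (end - run_start) // 3 >= min_codons:
            yield (run_start, end)


if __name__ == "__main__":
    dna = "ATGAAATGACCC"
    for start, end in open_reading_frames(dna, min_codons=2):
        print(start, end, dna[start:end])
```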

The terms "operably linked" or "operably associated" refer to functionally related nucleic
acid sequences. For example, a promoter is operably associated or operably linked with a coding
sequence if the promoter controls the transcription of the coding sequence. While operably
linked nucleic acid sequences can be contiguous and in the same reading frame, certain genetic
elements, e.g. repressor genes, are not contiguously linked to the coding sequence but still control
transcription/translation of the coding sequence.

The term "pluripotent" refers to the capability of a cell to differentiate into a number of
differentiated cell types that are present in an adult organism. A pluripotent cell is restricted in its
differentiation capability in comparison to a totipotent cell.

The terms "polypeptide" or "peptide" or "amino acid sequence" refer to an oligopeptide,
peptide, polypeptide or protein sequence or fragment thereof and to naturally occurring or
synthetic molecules. A polypeptide "fragment," "portion," or "segment" is a stretch of amino
acid residues of at least about 5 amino acids, preferably at least about 7 amino acids, more
preferably at least about 9 amino acids and most preferably at least about 17 or more amino
acids. The peptide preferably is not greater than about 500 amino acids, more preferably less
than 200 amino acids, more preferably less than 150 amino acids and most preferably less than
100 amino acids. Preferably the peptide is from about 5 to about 200 amino acids. To be active,
any polypeptide must have sufficient length to display biological and/or immunological activity.

The term "naturally occurring polypeptide" refers to polypeptides produced by cells that
have not been genetically engineered and specifically contemplates various polypeptides arising
from post-translational modifications of the polypeptide including, but not limited to, acetylation,
carboxylation, glycosylation, phosphorylation, lipidation and acylation.

The term "translated protein coding portion" means a sequence which encodes for the full
length protein which may include any leader sequence or any processing sequence.

The term "mature protein coding sequence" means a sequence which encodes a peptide
or protein without a signal or leader sequence. The "mature protein portion" means that portion
of the protein which does not include a signal or leader sequence. The peptide may have been
produced by processing in the cell which removes any leader/signal sequence. The mature
protein portion may or may not include the initial methionine residue. The methionine residue
may be removed from the protein during processing in the cell. The peptide may be produced
synthetically or the protein may have been produced using a polynucleotide only encoding for
the mature protein coding sequence.

The term "derivative" refers to polypeptides chemically modified by such techniques as
ubiquitination, labeling (e.g., with radionuclides or various enzymes), covalent polymer
attachment such as pegylation (derivatization with polyethylene glycol) and insertion or
substitution by chemical synthesis of amino acids such as ornithine, which do not normally occur
in human proteins.

The term "variant" (or "analog") refers to any polypeptide differing from naturally
occurring polypeptides by amino acid insertions, deletions, and substitutions, created using, e.g.,
recombinant DNA techniques. Guidance in determining which amino acid residues may be
replaced, added or deleted without abolishing activities of interest, may be found by comparing
the sequence of the particular polypeptide with that of homologous peptides and minimizing the
number of amino acid sequence changes made in regions of high homology (conserved regions)
or by replacing amino acids with consensus sequence.

Alternatively, recombinant variants encoding these same or similar polypeptides may be
synthesized or selected by making use of the "redundancy" in the genetic code. Various codon
substitutions, such as the silent changes which produce various restriction sites, may be
introduced to optimize cloning into a plasmid or viral vector or expression in a particular
prokaryotic or eukaryotic system. Mutations in the polynucleotide sequence may be reflected in
the polypeptide or domains of other peptides added to the polypeptide to modify the properties of
any part of the polypeptide, to change characteristics such as ligand-binding affinities, interchain
affinities, or degradation/turnover rate.

Preferably, amino acid "substitutions" are the result of replacing one amino acid with
another amino acid having similar structural and/or chemical properties, i.e., conservative amino
acid replacements. "Conservative" amino acid substitutions may be made on the basis of
similarity in polarity, charge, solubility, hydrophobicity, hydrophilicity, and/or the amphipathic
nature of the residues involved. For example, nonpolar (hydrophobic) amino acids include
alanine, leucine, isoleucine, valine, proline, phenylalanine, tryptophan, and methionine; polar
neutral amino acids include glycine, serine, threonine, cysteine, tyrosine, asparagine, and
glutamine; positively charged (basic) amino acids include arginine, lysine, and histidine; and
negatively charged (acidic) amino acids include aspartic acid and glutamic acid. "Insertions" or
"deletions" are preferably in the range of about 1 to 20 amino acids, more preferably 1 to 10
amino acids. The variation allowed may be experimentally determined by systematically making
insertions, deletions, or substitutions of amino acids in a polypeptide molecule using
recombinant DNA techniques and assaying the resulting recombinant variants for activity.
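
The four residue classes listed above lend themselves to a simple lookup. The Python sketch below
is illustrative only: the names are hypothetical, residues are written with their standard
three-letter abbreviations, and treating "same class" as the sole test for a conservative
substitution is a simplifying assumption, since the text names several other bases for similarity.

```python
# Illustrative sketch only; names are hypothetical and not taken from the specification.
RESIDUE_CLASSES = {
    "nonpolar": {"Ala", "Leu", "Ile", "Val", "Pro", "Phe", "Trp", "Met"},
    "polar neutral": {"Gly", "Ser", "Thr", "Cys", "Tyr", "Asn", "Gln"},
    "basic": {"Arg", "Lys", "His"},
    "acidic": {"Asp", "Glu"},
}


def residue_class(residue: str) -> str:
    """Return the class name for a three-letter amino acid abbreviation."""
    for name, members in RESIDUE_CLASSES.items():
        if residue in members:
            return name
    raise ValueError(f"unknown residue: {residue}")


def is_conservative(original: str, replacement: str) -> bool:
    """True when both residues fall in the same class listed in the text."""
    return residue_class(original) == residue_class(replacement)


if __name__ == "__main__":
    print(is_conservative("Leu", "Ile"))  # both nonpolar
    print(is_conservative("Lys", "Asp"))  # basic versus acidic
```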

Alternatively, where alteration of function is desired, insertions, deletions or
non-conservative alterations can be engineered to produce altered polypeptides. Such alterations
can, for example, alter one or more of the biological functions or biochemical characteristics of
the polypeptides of the invention. For example, such alterations may change polypeptide
characteristics such as ligand-binding affinities, interchain affinities, or degradation/turnover
rate. Further, such alterations can be selected so as to generate polypeptides that are better suited
for expression, scale up and the like in the host cells chosen for expression. For example,
cysteine residues can be deleted or substituted with another amino acid residue in order to
eliminate disulfide bridges.

The terms "purified" or "substantially purified" as used herein denote that the indicated
nucleic acid or polypeptide is present in the substantial absence of other biological
macromolecules, e.g., polynucleotides, proteins, and the like. In one embodiment, the
polynucleotide or polypeptide is purified such that it constitutes at least 95% by weight, more
preferably at least 99% by weight, of the indicated biological macromolecules present (but water,
buffers, and other small molecules, especially molecules having a molecular weight of less than
1000 daltons, can be present).

The term "isolated" as used herein refers to a nucleic acid or polypeptide separated from
at least one other component (e.g., nucleic acid or polypeptide) present with the nucleic acid or
polypeptide in its natural source. In one embodiment, the nucleic acid or polypeptide is found in
the presence of (if anything) only a solvent, buffer, ion, or other component normally present in a
solution of the same. The terms "isolated" and "purified" do not encompass nucleic acids or
polypeptides present in their natural source.

The term "recombinant," when used herein to refer to a polypeptide or protein, means
that a polypeptide or protein is derived from recombinant (e.g., microbial, insect, or mammalian)
expression systems. "Microbial" refers to recombinant polypeptides or proteins made in
bacterial or fungal (e.g., yeast) expression systems. As a product, "recombinant microbial"
defines a polypeptide or protein essentially free of native endogenous substances and
unaccompanied by associated native glycosylation. Polypeptides or proteins expressed in most
bacterial cultures, e.g., E. coli, will be free of glycosylation modifications; polypeptides or
proteins expressed in yeast will have a glycosylation pattern in general different from those
expressed in mammalian cells.

The term "recombinant expression vehicle or vector" refers to a plasmid or phage or virus
or vector, for expressing a polypeptide from a DNA (RNA) sequence. An expression vehicle can
comprise a transcriptional unit comprising an assembly of (1) a genetic element or elements
having a regulatory role in gene expression, for example, promoters or enhancers, (2) a structural
or coding sequence which is transcribed into mRNA and translated into protein, and (3)
appropriate transcription initiation and termination sequences. Structural units intended for use
in yeast or eukaryotic expression systems preferably include a leader sequence enabling
extracellular secretion of translated protein by a host cell. Alternatively, where recombinant
protein is expressed without a leader or transport sequence, it may include an amino terminal
methionine residue. This residue may or may not be subsequently cleaved from the expressed
recombinant protein to provide a final product.

The term "recombinant expression system" means host cells which have stably integrated
a recombinant transcriptional unit into chromosomal DNA or carry the recombinant
transcriptional unit extrachromosomally. Recombinant expression systems as defined herein will
express heterologous polypeptides or proteins upon induction of the regulatory elements linked
to the DNA segment or synthetic gene to be expressed. This term also means host cells which
have stably integrated a recombinant genetic element or elements having a regulatory role in
gene expression, for example, promoters or enhancers. Recombinant expression systems as
defined herein will express polypeptides or proteins endogenous to the cell upon induction of the
regulatory elements linked to the endogenous DNA segment or gene to be expressed. The cells
can be prokaryotic or eukaryotic.

The term "secreted" includes a protein that is transported across or through a membrane,
including transport as a result of signal sequences in its amino acid sequence when it is expressed
in a suitable host cell. "Secreted" proteins include without limitation proteins secreted wholly
(e.g., soluble proteins) or partially (e.g., receptors) from the cell in which they are expressed.
"Secreted" proteins also include without limitation proteins that are transported across the
membrane of the endoplasmic reticulum. "Secreted" proteins are also intended to include
proteins containing non-typical signal sequences (e.g., Interleukin-1 Beta, see Krasney, P.A. and
Young, P.R. (1992) Cytokine 4(2):134-143) and factors released from damaged cells (e.g.,
Interleukin-1 Receptor Antagonist, see Arend, W.P. et al. (1998) Annu. Rev. Immunol.
16:27-55).

Where desired, an expression vector may be designed to contain a "signal or leader
sequence" which will direct the polypeptide through the membrane of a cell. Such a sequence
may be naturally present on the polypeptides of the present invention or provided from
heterologous protein sources by recombinant DNA techniques.

The term "stringent" is used to refer to conditions that are commonly understood in the
art as stringent. Stringent conditions can include highly stringent conditions (i.e., hybridization
to filter-bound DNA in 0.5 M NaHPO4, 7% sodium dodecyl sulfate (SDS), 1 mM EDTA at
65°C, and washing in 0.1X SSC/0.1% SDS at 68°C), and moderately stringent conditions (i.e.,
washing in 0.2X SSC/0.1% SDS at 42°C). Other exemplary hybridization conditions are
described herein in the examples.

In instances of hybridization of deoxyoligonucleotides, additional exemplary stringent
hybridization conditions include washing in 6X SSC/0.05% sodium pyrophosphate at 37°C (for
14-base oligonucleotides), 48°C (for 17-base oligos), 55°C (for 20-base oligonucleotides), and
60°C (for 23-base oligonucleotides).
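
The four oligonucleotide wash temperatures listed above amount to a small lookup table. The sketch below is illustrative only: the dictionary and function names are hypothetical, and it deliberately refuses to interpolate for lengths the text does not list.

```python
# Wash temperatures (degrees C) in 6X SSC/0.05% sodium pyrophosphate,
# keyed by oligonucleotide length in bases, as listed in the text.
OLIGO_WASH_TEMPERATURE_C = {14: 37, 17: 48, 20: 55, 23: 60}


def wash_temperature_c(oligo_length):
    """Return the exemplary stringent wash temperature for a listed length."""
    if oligo_length not in OLIGO_WASH_TEMPERATURE_C:
        # No interpolation: the source gives values only for four lengths.
        raise ValueError(
            f"no exemplary wash temperature listed for {oligo_length}-mers"
        )
    return OLIGO_WASH_TEMPERATURE_C[oligo_length]


print(wash_temperature_c(17))
```

Raising an error for unlisted lengths keeps the sketch faithful to the text, which gives discrete examples rather than a formula.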

As used herein, "substantially equivalent" can refer both to nucleotide and amino acid
sequences, for example a mutant sequence, that varies from a reference sequence by one or more
substitutions, deletions, or additions, the net effect of which does not result in an adverse
functional dissimilarity between the reference and subject sequences. Typically, such a
substantially equivalent sequence varies from one of those listed herein by no more than about
35% (i.e., the number of individual residue substitutions, additions, and/or deletions in a
substantially equivalent sequence, as compared to the corresponding reference sequence, divided
by the total number of residues in the substantially equivalent sequence is about 0.35 or less).
Such a sequence is said to have 65% sequence identity to the listed sequence. In one
embodiment, a substantially equivalent, e.g., mutant, sequence of the invention varies from a
listed sequence by no more than 30% (70% sequence identity); in a variation of this embodiment,
by no more than 25% (75% sequence identity); in a further variation of this embodiment, by
no more than 20% (80% sequence identity); in a further variation of this embodiment, by no
more than 10% (90% sequence identity); and in a further variation of this embodiment, by no
more than 5% (95% sequence identity). Substantially equivalent, e.g., mutant, amino acid
sequences according to the invention preferably have at least 80% sequence identity with a listed
amino acid sequence, more preferably at least 85% sequence identity, more preferably at least
90% sequence identity, more preferably at least 95% sequence identity, more preferably at least
98% sequence identity and most preferably at least 98% identity. Substantially equivalent
nucleotide sequences of the invention can have lower percent sequence identities, taking into
account, for example, the redundancy or degeneracy of the genetic code. Preferably, a nucleotide
sequence has at least about 65% identity, more preferably at least about 75% identity, more
preferably at least about 80% identity, more preferably at least about 85% identity, more
preferably at least about 90% identity, and most preferably at least about 95% identity, more
preferably at least 98% and most preferably at least about 99% identity. For the purposes of the
present invention, sequences having substantially equivalent biological activity and substantially
equivalent expression characteristics are considered substantially equivalent. For the purposes of
determining equivalence, truncation of the mature sequence (e.g., via a mutation which creates a
spurious stop codon) should be disregarded. Sequence identity may be determined, e.g., using
the Jotun Hein method (Hein, J. (1990) Methods Enzymol. 183:626-645). Identity between
sequences can also be determined by other methods known in the art, e.g., by varying
hybridization conditions.
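
The ratio that defines "substantially equivalent" above can be written out directly. The following is a minimal sketch under stated assumptions: the function names are hypothetical, the counts of substitutions, additions, and deletions are taken as given (for example from an alignment produced elsewhere), and no alignment method such as Jotun Hein is implemented here.

```python
def fraction_varied(substitutions, additions, deletions, variant_length):
    """Changes relative to the reference, divided by the variant's length."""
    if variant_length <= 0:
        raise ValueError("variant_length must be positive")
    return (substitutions + additions + deletions) / variant_length


def percent_identity(substitutions, additions, deletions, variant_length):
    """Percent sequence identity implied by the same counts."""
    changes = substitutions + additions + deletions
    return 100.0 * (variant_length - changes) / variant_length


def is_substantially_equivalent(substitutions, additions, deletions,
                                variant_length, max_fraction=0.35):
    """Apply the 'about 0.35 or less' criterion, or a stricter tier."""
    return fraction_varied(substitutions, additions, deletions,
                           variant_length) <= max_fraction


# Hypothetical variant of 100 residues with 30 changes in total.
print(percent_identity(20, 5, 5, 100))
print(is_substantially_equivalent(20, 5, 5, 100))
print(is_substantially_equivalent(20, 5, 5, 100, max_fraction=0.25))
```

Passing a smaller `max_fraction` corresponds to the narrower embodiments in the text (0.30, 0.25, 0.20, 0.10, or 0.05).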

The term "totipotent" refers to the capability of a cell to differentiate into all of the cell
types of an adult organism.

The term "transformation" means introducing DNA into a suitable host cell so that the
DNA is replicable, either as an extrachromosomal element, or by chromosomal integration. The
term "transfection" refers to the taking up of an expression vector by a suitable host cell, whether
or not any coding sequences are in fact expressed. The term "infection" refers to the introduction
of nucleic acids into a suitable host cell by use of a virus or viral vector.

As used herein, an "uptake modulating fragment," UMF, means a series of nucleotides
which mediate the uptake of a linked DNA fragment into a cell. UMFs can be readily identified
using known UMFs as a target sequence or target motif with the computer-based systems
described below. The presence and activity of a UMF can be confirmed by attaching the
suspected UMF to a marker sequence. The resulting nucleic acid molecule is then incubated
with an appropriate host under appropriate conditions and the uptake of the marker sequence is
determined. As described above, a UMF will increase the frequency of uptake of a linked
marker sequence.

Each of the above terms is meant to encompass all that is described for each, unless the
context dictates otherwise.

4.2 NUCLEIC ACIDS OF THE INVENTION

Nucleotide sequences of the invention are set forth in the Sequence Listing.

The isolated polynucleotides of the invention include a polynucleotide comprising the
nucleotide sequences of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954; a
polynucleotide encoding any one of the peptide sequences of SEQ ID NO: 985-1968, 2953-3936,
3943-3948 or 3955-3960; and a polynucleotide comprising the nucleotide sequence encoding the
mature protein coding sequence of the polypeptides of any one of SEQ ID NO: 985-1968,
2953-3936, 3943-3948 or 3955-3960. The polynucleotides of the present invention also include, but
are not limited to, a polynucleotide that hybridizes under stringent conditions to (a) the
complement of any of the nucleotide sequences of SEQ ID NO: 1-984, 1969-2952, 3937-3942
or 3949-3954; (b) nucleotide sequences encoding any one of the amino acid sequences set forth
in the Sequence Listing as SEQ ID NO: 985-1968, 2953-3936, 3943-3948 or 3955-3960; (c) a
polynucleotide which is an allelic variant of any polynucleotide recited above; (d) a
polynucleotide which encodes a species homolog of any of the proteins recited above; or (e) a
polynucleotide that encodes a polypeptide comprising a specific domain or truncation of the
polypeptides of SEQ ID NO: 985-1968, 2953-3936, 3943-3948 or 3955-3960. Domains of
interest may depend on the nature of the encoded polypeptide; e.g., domains in receptor-like
polypeptides include ligand-binding, extracellular, transmembrane, or cytoplasmic domains, or
combinations thereof; domains in immunoglobulin-like proteins include the variable
immunoglobulin-like domains; domains in enzyme-like polypeptides include catalytic and
substrate binding domains; and domains in ligand polypeptides include receptor-binding
domains.

The polynucleotides of the invention include naturally occurring or wholly or partially 
synthetic DNA, e.g., cDNA and genomic DNA, and RNA, e.g., mRNA. The polynucleotides
may include all of the coding region of the cDNA or may represent a portion of the coding 
region of the cDNA. 

The present invention also provides genes corresponding to the cDNA sequences disclosed
herein. The corresponding genes can be isolated in accordance with known methods using the
sequence information disclosed herein. Such methods include the preparation of probes or primers
from the disclosed sequence information for identification and/or amplification of genes in
appropriate genomic libraries or other sources of genomic materials. Further 5' and 3' sequence can
be obtained using methods known in the art. For example, full length cDNA or genomic DNA that
corresponds to any of the polynucleotides of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or
3949-3954 can be obtained by screening appropriate cDNA or genomic DNA libraries under suitable
hybridization conditions using any of the polynucleotides of SEQ ID NO: 1-984, 1969-2952,
3937-3942 or 3949-3954 or a portion thereof as a probe. Alternatively, the polynucleotides of SEQ ID
NO: 1-984, 1969-2952, 3937-3942 or 3949-3954 may be used as the basis for suitable primer(s)
that allow identification and/or amplification of genes in appropriate genomic DNA or cDNA
libraries.

The nucleic acid sequences of the invention can be assembled from ESTs and sequences
(including cDNA and genomic sequences) obtained from one or more public databases, such as
dbEST, gbpri, and UniGene. The EST sequences can provide identifying sequence information,
representative fragment or segment information, or novel segment information for the full-length
gene.

The polynucleotides of the invention also provide polynucleotides including nucleotide
sequences that are substantially equivalent to the polynucleotides recited above. Polynucleotides
according to the invention can have, e.g., at least about 65%, at least about 70%, at least about
75%, at least about 80%, 81%, 82%, 83%, 84%, more typically at least about 85%, 86%, 87%,
88%, 89%, and more typically at least about 90%, 91%, 92%, 93%, 94%, and even more
typically at least about 95%, 96%, 97%, 98%, 99%, sequence identity to a polynucleotide recited
above.

Included within the scope of the nucleic acid sequences of the invention are nucleic acid
sequence fragments that hybridize under stringent conditions to any of the nucleotide sequences
of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954, or complements thereof, which
fragment is greater than about 5 nucleotides, preferably 7 nucleotides, more preferably greater
than 9 nucleotides and most preferably greater than 17 nucleotides. Fragments of, e.g., 15, 17, or
20 nucleotides or more that are selective for (i.e., specifically hybridize to) any one of the
polynucleotides of the invention are contemplated. Probes capable of specifically hybridizing to
a polynucleotide can differentiate polynucleotide sequences of the invention from other
polynucleotide sequences in the same family of genes or can differentiate human genes from
genes of other species, and are preferably based on unique nucleotide sequences.

The sequences falling within the scope of the present invention are not limited to these
specific sequences, but also include allelic and species variations thereof. Allelic and species
variations can be routinely determined by comparing the sequence provided in SEQ ID NO: 1-984,
1969-2952, 3937-3942 or 3949-3954, a representative fragment thereof, or a nucleotide sequence at
least 90% identical, preferably 95% identical, to SEQ ID NO: 1-984, 1969-2952, 3937-3942 or
3949-3954 with a sequence from another isolate of the same species. Furthermore, to accommodate
codon variability, the invention includes nucleic acid molecules coding for the same amino acid
sequences as do the specific ORFs disclosed herein. In other words, in the coding region of an
ORF, substitution of one codon for another codon that encodes the same amino acid is expressly
contemplated.
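
Codon substitution that preserves the encoded amino acid can be checked by translating both sequences. The following is a minimal sketch, assuming the standard genetic code and short hypothetical ORFs that are not taken from the Sequence Listing; the function names are illustrative.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(orf):
    """Translate an in-frame DNA (or RNA) ORF; '*' marks a stop codon."""
    orf = orf.upper().replace("U", "T")
    if len(orf) % 3 != 0:
        raise ValueError("ORF length must be a multiple of three")
    return "".join(CODON_TABLE[orf[i:i + 3]] for i in range(0, len(orf), 3))


def encodes_same_protein(orf_a, orf_b):
    """True when two ORFs differ only by synonymous codon choices."""
    return translate(orf_a) == translate(orf_b)


# GCT and GCC both encode alanine; GAT encodes aspartate.
print(encodes_same_protein("ATGGCTTAA", "ATGGCCTAA"))
print(encodes_same_protein("ATGGCTTAA", "ATGGATTAA"))
```

The first comparison is a synonymous substitution of the kind the paragraph contemplates; the second changes the encoded residue and so falls outside it.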

The nearest neighbor or homology result for the nucleic acids of the present invention,
including SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954, can be obtained by searching a
database using an algorithm or a program. Preferably, BLAST, which stands for Basic Local
Alignment Search Tool, is used to search for local sequence alignments (Altschul, S.F., J. Mol.
Evol. 36:290-300 (1993) and Altschul, S.F. et al., J. Mol. Biol. 215:403-410 (1990)).
Alternatively, a FASTA version 3 search against Genpept, using the Fastxy algorithm, can be used.

Species homologs (or orthologs) of the disclosed polynucleotides and proteins are also
provided by the present invention. Species homologs may be isolated and identified by making
suitable probes or primers from the sequences provided herein and screening a suitable nucleic
acid source from the desired species.

The invention also encompasses allelic variants of the disclosed polynucleotides or
proteins; that is, naturally-occurring alternative forms of the isolated polynucleotide which also
encode proteins which are identical, homologous or related to that encoded by the
polynucleotides.

The nucleic acid sequences of the invention are further directed to sequences which
encode variants of the described nucleic acids. These amino acid sequence variants may be
prepared by methods known in the art by introducing appropriate nucleotide changes into a
native or variant polynucleotide. There are two variables in the construction of amino acid
sequence variants: the location of the mutation and the nature of the mutation. Nucleic acids
encoding the amino acid sequence variants are preferably constructed by mutating the
polynucleotide to encode an amino acid sequence that does not occur in nature. These nucleic
acid alterations can be made at sites that differ in the nucleic acids from different species
(variable positions) or in highly conserved regions (constant regions). Sites at such locations
will typically be modified in series, e.g., by substituting first with conservative choices (e.g.,
hydrophobic amino acid to a different hydrophobic amino acid) and then with more distant
choices (e.g., hydrophobic amino acid to a charged amino acid), and then deletions or insertions
may be made at the target site. Amino acid sequence deletions generally range from about 1 to
30 residues, preferably about 1 to 10 residues, and are typically contiguous. Amino acid
insertions include amino- and/or carboxyl-terminal fusions ranging in length from one to one
hundred or more residues, as well as intrasequence insertions of single or multiple amino acid
residues. Intrasequence insertions may range generally from about 1 to 10 amino residues,
preferably from 1 to 5 residues. Examples of terminal insertions include the heterologous signal
sequences necessary for secretion or for intracellular targeting in different host cells and
sequences such as FLAG or poly-histidine sequences useful for purifying the expressed protein.

In a preferred method, polynucleotides encoding the novel amino acid sequences are
changed via site-directed mutagenesis. This method uses oligonucleotide sequences to alter a
polynucleotide to encode the desired amino acid variant, as well as sufficient adjacent
nucleotides on both sides of the changed amino acid to form a stable duplex on either side of the
site being changed. In general, the techniques of site-directed mutagenesis are well known to
those of skill in the art and this technique is exemplified by publications such as Edelman et al.,
DNA 2:183 (1983). A versatile and efficient method for producing site-specific changes in a
polynucleotide sequence was published by Zoller and Smith, Nucleic Acids Res. 10:6487-6500
(1982). PCR may also be used to create amino acid sequence variants of the novel nucleic acids.
When small amounts of template DNA are used as starting material, primer(s) that differ
slightly in sequence from the corresponding region in the template DNA can generate the desired
amino acid variant. PCR amplification results in a population of product DNA fragments that
differ from the polynucleotide template encoding the polypeptide at the position specified by the
primer. The product DNA fragments replace the corresponding region in the plasmid and this
gives a polynucleotide encoding the desired amino acid variant.
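
The layout of a mutagenic oligonucleotide (the changed codon flanked by unchanged template sequence on both sides) can be sketched as follows. This is an illustration only: the function name, the flank length, and the template are hypothetical, and the sketch does not assess duplex stability, melting temperature, or strand orientation.

```python
def mutagenic_oligo(template, codon_index, new_codon, flank=12):
    """Return the new codon flanked by unchanged template sequence.

    template:    in-frame coding sequence (sense strand)
    codon_index: zero-based index of the codon to replace
    new_codon:   three-base replacement codon
    flank:       number of template bases kept on each side (assumed value)
    """
    start = codon_index * 3
    if len(new_codon) != 3:
        raise ValueError("new_codon must be exactly three bases")
    if codon_index < 0 or start + 3 > len(template):
        raise ValueError("codon_index lies outside the template")
    left = template[max(0, start - flank):start]
    right = template[start + 3:start + 3 + flank]
    return left + new_codon + right


# Hypothetical template: ATG GCT GAA ACC GGT TGC AAA GTT.
# Codon 5 (TGC, cysteine) is replaced with TCC (serine).
template = "ATGGCTGAAACCGGTTGCAAAGTT"
print(mutagenic_oligo(template, 5, "TCC", flank=6))
```

A cysteine-to-serine change is used because cysteine substitution to eliminate disulfide bridges is one of the alterations mentioned earlier in the definitions; the specific codons are chosen only for illustration.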

A further technique for generating amino acid variants is the cassette mutagenesis
technique described in Wells et al., Gene 34:315 (1985); and other mutagenesis techniques well
known in the art, such as, for example, the techniques in Sambrook et al., supra, and Current
Protocols in Molecular Biology, Ausubel et al. Due to the inherent degeneracy of the genetic
code, other DNA sequences which encode substantially the same or a functionally equivalent
amino acid sequence may be used in the practice of the invention for the cloning and expression
of these novel nucleic acids. Such DNA sequences include those which are capable of
hybridizing to the appropriate novel nucleic acid sequence under stringent conditions.

Polynucleotides encoding preferred polypeptide truncations of the invention can be used
to generate polynucleotides encoding chimeric or fusion proteins comprising one or more
domains of the invention and heterologous protein sequences.

The polynucleotides of the invention additionally include the complement of any of the
polynucleotides recited above. The polynucleotide can be DNA (genomic, cDNA, amplified, or
synthetic) or RNA. Methods and algorithms for obtaining such polynucleotides are well known
to those of skill in the art and can include, for example, methods for determining hybridization
conditions that can routinely isolate polynucleotides of the desired sequence identities.

In accordance with the invention, polynucleotide sequences comprising the mature
protein coding sequences corresponding to any one of SEQ ID NO: 1-984, 1969-2952,
3937-3942 or 3949-3954, or functional equivalents thereof, may be used to generate recombinant
DNA molecules that direct the expression of that nucleic acid, or a functional equivalent thereof,
in appropriate host cells. Also included are the cDNA inserts of any of the clones identified
herein.

A polynucleotide according to the invention can be joined to any of a variety of other
nucleotide sequences by well-established recombinant DNA techniques (see Sambrook J et al.
(1989) Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, NY). Useful
nucleotide sequences for joining to polynucleotides include an assortment of vectors, e.g.,
plasmids, cosmids, lambda phage derivatives, phagemids, and the like, that are well known in the
art. Accordingly, the invention also provides a vector including a polynucleotide of the
invention and a host cell containing the polynucleotide. In general, the vector contains an origin
of replication functional in at least one organism, convenient restriction endonuclease sites, and a
selectable marker for the host cell. Vectors according to the invention include expression
vectors, replication vectors, probe generation vectors, and sequencing vectors. A host cell
according to the invention can be a prokaryotic or eukaryotic cell and can be a unicellular
organism or part of a multicellular organism.

The present invention further provides recombinant constructs comprising a nucleic acid
having any of the nucleotide sequences of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or
3949-3954 or a fragment thereof or any other polynucleotides of the invention. In one embodiment, the
recombinant constructs of the present invention comprise a vector, such as a plasmid or viral
vector, into which a nucleic acid having any of the nucleotide sequences of SEQ ID NO: 1-984,
1969-2952, 3937-3942 or 3949-3954 or a fragment thereof is inserted, in a forward or reverse
orientation. In the case of a vector comprising one of the ORFs of the present invention, the
vector may further comprise regulatory sequences, including for example, a promoter, operably
linked to the ORF. Large numbers of suitable vectors and promoters are known to those of skill
in the art and are commercially available for generating the recombinant constructs of the present
invention. The following vectors are provided by way of example. Bacterial: pBs, phagescript,
PsiX174, pBluescript SK, pBs KS, pNH8a, pNH16a, pNH18a, pNH46a (Stratagene); pTrc99A,
pKK223-3, pKK233-3, pDR540, pRIT5 (Pharmacia). Eukaryotic: pWLneo, pSV2cat, pOG44,
PXTI, pSG (Stratagene) pSVK3, pBPV, pMSG, pSVL (Pharmacia).

The isolated polynucleotide of the invention may be operably linked to an expression
control sequence such as the pMT2 or pED expression vectors disclosed in Kaufman et al.,
Nucleic Acids Res. 19, 4485-4490 (1991), in order to produce the protein recombinantly. Many
suitable expression control sequences are known in the art. General methods of expressing
recombinant proteins are also known and are exemplified in R. Kaufman, Methods in
Enzymology 185, 537-566 (1990). As defined herein "operably linked" means that the isolated
polynucleotide of the invention and an expression control sequence are situated within a vector
or cell in such a way that the protein is expressed by a host cell which has been transformed
(transfected) with the ligated polynucleotide/expression control sequence.

Promoter regions can be selected from any desired gene using CAT (chloramphenicol
transferase) vectors or other vectors with selectable markers. Two appropriate vectors are
pKK232-8 and pCM7. Particular named bacterial promoters include lacI, lacZ, T3, T7, gpt,
lambda PR, and trc. Eukaryotic promoters include CMV immediate early, HSV thymidine
kinase, early and late SV40, LTRs from retrovirus, and mouse metallothionein-I. Selection of
the appropriate vector and promoter is well within the level of ordinary skill in the art.
Generally, recombinant expression vectors will include origins of replication and selectable
markers permitting transformation of the host cell, e.g., the ampicillin resistance gene of E. coli
and S. cerevisiae TRP1 gene, and a promoter derived from a highly-expressed gene to direct
transcription of a downstream structural sequence. Such promoters can be derived from operons
encoding glycolytic enzymes such as 3-phosphoglycerate kinase (PGK), α-factor, acid
phosphatase, or heat shock proteins, among others. The heterologous structural sequence is
assembled in appropriate phase with translation initiation and termination sequences, and
preferably, a leader sequence capable of directing secretion of translated protein into the
periplasmic space or extracellular medium. Optionally, the heterologous sequence can encode a
fusion protein including an amino terminal identification peptide imparting desired
characteristics, e.g., stabilization or simplified purification of expressed recombinant product.

Useful expression vectors for bacterial use are constructed by inserting a structural DNA
sequence encoding a desired protein together with suitable translation initiation and termination
signals in operable reading phase with a functional promoter. The vector will comprise one or
more phenotypic selectable markers and an origin of replication to ensure maintenance of the
vector and to, if desirable, provide amplification within the host. Suitable prokaryotic hosts for
transformation include E. coli, Bacillus subtilis, Salmonella typhimurium and various species
within the genera Pseudomonas, Streptomyces, and Staphylococcus, although others may also be
employed as a matter of choice.

As a representative but non-limiting example, useful expression vectors for bacterial use
can comprise a selectable marker and bacterial origin of replication derived from commercially
available plasmids comprising genetic elements of the well known cloning vector pBR322
(ATCC 37017). Such commercial vectors include, for example, pKK223-3 (Pharmacia Fine
Chemicals, Uppsala, Sweden) and GEM 1 (Promega Biotech, Madison, WI, USA). These
pBR322 "backbone" sections are combined with an appropriate promoter and the structural
sequence to be expressed. Following transformation of a suitable host strain and growth of the
host strain to an appropriate cell density, the selected promoter is induced or derepressed by
appropriate means (e.g., temperature shift or chemical induction) and cells are cultured for an
additional period. Cells are typically harvested by centrifugation, disrupted by physical or
chemical means, and the resulting crude extract retained for further purification.

Polynucleotides of the invention can also be used to induce immune responses. For
example, as described in Fan et al., Nat. Biotech. 17:870-872 (1999), incorporated herein by
reference, nucleic acid sequences encoding a polypeptide may be used to generate antibodies
against the encoded polypeptide following topical administration of naked plasmid DNA or
following injection, and preferably intramuscular injection of the DNA. The nucleic acid
sequences are preferably inserted in a recombinant expression vector and may be in the form of
naked DNA.

4.3 ANTISENSE

Another aspect of the invention pertains to isolated antisense nucleic acid molecules that
are hybridizable to or complementary to the nucleic acid molecule comprising the nucleotide
sequence of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954, or fragments, analogs or
derivatives thereof. An "antisense" nucleic acid comprises a nucleotide sequence that is
complementary to a "sense" nucleic acid encoding a protein, e.g., complementary to the coding
strand of a double-stranded cDNA molecule or complementary to an mRNA sequence. In
specific aspects, antisense nucleic acid molecules are provided that comprise a sequence
complementary to at least about 10, 25, 50, 100, 250 or 500 nucleotides or an entire coding
strand, or to only a portion thereof. Nucleic acid molecules encoding fragments, homologs,
derivatives and analogs of a protein of any of SEQ ID NO: 985-1968, 2953-3936, 3943-3948 or
3955-3960 or antisense nucleic acids complementary to a nucleic acid sequence of SEQ ID NO:
1-984, 1969-2952, 3937-3942 or 3949-3954 are additionally provided.

In one embodiment, an antisense nucleic acid molecule is antisense to a "coding region"
of the coding strand of a nucleotide sequence of the invention. The term "coding region" refers
to the region of the nucleotide sequence comprising codons which are translated into amino acid
residues. In another embodiment, the antisense nucleic acid molecule is antisense to a
"noncoding region" of the coding strand of a nucleotide sequence of the invention. The term
"noncoding region" refers to 5' and 3' sequences which flank the coding region that are not
translated into amino acids (i.e., also referred to as 5' and 3' untranslated regions).

Given the coding strand sequences encoding a nucleic acid disclosed herein (e.g., SEQ ID
NO: 1-984, 1969-2952, 3937-3942 or 3949-3954), antisense nucleic acids of the invention can be
designed according to the rules of Watson and Crick or Hoogsteen base pairing. The antisense
nucleic acid molecule can be complementary to the entire coding region of a mRNA, but more
preferably is an oligonucleotide that is antisense to only a portion of the coding or noncoding
region of a mRNA. For example, the antisense oligonucleotide can be complementary to the
region surrounding the translation start site of a mRNA. An antisense oligonucleotide can be, for
example, about 5, 10, 15, 20, 25, 30, 35, 40, 45 or 50 nucleotides in length. An antisense nucleic
acid of the invention can be constructed using chemical synthesis or enzymatic ligation reactions
using procedures known in the art. For example, an antisense nucleic acid (e.g., an antisense
oligonucleotide) can be chemically synthesized using naturally occurring nucleotides or
variously modified nucleotides designed to increase the biological stability of the molecules or to
increase the physical stability of the duplex formed between the antisense and sense nucleic
acids, e.g., phosphorothioate derivatives and acridine substituted nucleotides can be used.
Examples of modified nucleotides that can be used to generate the antisense nucleic acid
include: 5-fluorouracil, 5-bromouracil, 5-chlorouracil, 5-iodouracil, hypoxanthine, xanthine,
4-acetylcytosine, 5-(carboxyhydroxylmethyl) uracil, 5-carboxymethylaminomethyl-2-thiouridine,
5-carboxymethylaminomethyluracil, dihydrouracil, beta-D-galactosylqueosine,
inosine, N6-isopentenyladenine, 1-methylguanine, 1-methylinosine, 2,2-dimethylguanine,
2-methyladenine, 2-methylguanine, 3-methylcytosine, 5-methylcytosine, N6-adenine,
7-methylguanine, 5-methylaminomethyluracil, 5-methoxyaminomethyl-2-thiouracil,
beta-D-mannosylqueosine, 5'-methoxycarboxymethyluracil, 5-methoxyuracil,
2-methylthio-N6-isopentenyladenine, uracil-5-oxyacetic acid (v), wybutoxosine, pseudouracil,
queosine, 2-thiocytosine, 5-methyl-2-thiouracil, 2-thiouracil, 4-thiouracil, 5-methyluracil,
uracil-5-oxyacetic acid methylester, uracil-5-oxyacetic acid (v), 5-methyl-2-thiouracil,
3-(3-amino-3-N-2-carboxypropyl) uracil, (acp3)w, and 2,6-diaminopurine. Alternatively, the
antisense nucleic acid can be produced biologically using an expression vector into which a
nucleic acid has been subcloned in an antisense orientation (i.e., RNA transcribed from the
inserted nucleic acid will be of an antisense orientation to a target nucleic acid of interest,
described further in the following subsection).
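
The Watson-Crick design rule described above can be illustrated with a minimal sketch. It is
not a procedure taken from this disclosure: the function names, the default length of 20, and
the choice to treat the first AUG in the transcript as the translation start site are all
assumptions made for illustration only.

```python
# Watson-Crick partner of each target base; the antisense oligo is written as DNA.
COMPLEMENT = {"A": "T", "U": "A", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the DNA reverse complement of an RNA or DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def antisense_oligo(mrna: str, length: int = 20) -> str:
    """Return a DNA oligo antisense to a window around the first AUG.

    Assumption: the first AUG is used as a stand-in for the translation
    start site; a real design would use the annotated start codon.
    """
    rna = mrna.upper().replace("T", "U")
    start = rna.find("AUG")
    if start < 0:
        raise ValueError("no AUG start codon found")
    # Centre the window on the start codon, clipped to the transcript ends.
    left = max(0, start + 1 - length // 2)
    left = min(left, max(0, len(rna) - length))
    window = rna[left:left + length]
    return reverse_complement(window)


if __name__ == "__main__":
    print(antisense_oligo("GGCACCAUGGCUAGCUUA", length=9))
```

The oligo is returned 5' to 3', so it pairs antiparallel with the selected mRNA window.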

The antisense nucleic acid molecules of the invention are typically administered to a
subject or generated in situ such that they hybridize with or bind to cellular mRNA and/or
genomic DNA encoding a protein according to the invention to thereby inhibit expression of the
protein, e.g., by inhibiting transcription and/or translation. The hybridization can be by
conventional nucleotide complementarity to form a stable duplex, or, for example, in the case of
an antisense nucleic acid molecule that binds to DNA duplexes, through specific interactions in
the major groove of the double helix. An example of a route of administration of antisense
nucleic acid molecules of the invention includes direct injection at a tissue site. Alternatively,
antisense nucleic acid molecules can be modified to target selected cells and then administered
systemically. For example, for systemic administration, antisense molecules can be modified
such that they specifically bind to receptors or antigens expressed on a selected cell surface, e.g.,
by linking the antisense nucleic acid molecules to peptides or antibodies that bind to cell surface
receptors or antigens. The antisense nucleic acid molecules can also be delivered to cells using
the vectors described herein. To achieve sufficient intracellular concentrations of antisense
molecules, vector constructs in which the antisense nucleic acid molecule is placed under the
control of a strong pol II or pol III promoter are preferred.

In yet another embodiment, the antisense nucleic acid molecule of the invention is an
α-anomeric nucleic acid molecule. An α-anomeric nucleic acid molecule forms specific
double-stranded hybrids with complementary RNA in which, contrary to the usual β-units, the
strands run parallel to each other (Gaultier et al. (1987) Nucleic Acids Res 15: 6625-6641). The
antisense nucleic acid molecule can also comprise a 2'-o-methylribonucleotide (Inoue et al.
(1987) Nucleic Acids Res 15: 6131-6148) or a chimeric RNA-DNA analogue (Inoue et al. (1987)
215: 327-330).

4.4 RIBOZYMES AND PNA MOIETIES 

In still another embodiment, an antisense nucleic acid of the invention is a ribozyme.
Ribozymes are catalytic RNA molecules with ribonuclease activity that are capable of cleaving a
single-stranded nucleic acid, such as a mRNA, to which they have a complementary region.
Thus, ribozymes (e.g., hammerhead ribozymes (described in Haselhoff and Gerlach (1988)
Nature 334:585-591)) can be used to catalytically cleave mRNA transcripts to thereby inhibit
translation of a mRNA. A ribozyme having specificity for a nucleic acid of the invention can be
designed based upon the nucleotide sequence of a DNA disclosed herein (i.e., SEQ ID NO: 1-984,
1969-2952, 3937-3942 or 3949-3954). For example, a derivative of a Tetrahymena L-19
IVS RNA can be constructed in which the nucleotide sequence of the active site is
complementary to the nucleotide sequence to be cleaved in a SECX-encoding mRNA. See, e.g.,
Cech et al. U.S. Pat. No. 4,987,071; and Cech et al. U.S. Pat. No. 5,116,742. Alternatively,
SECX mRNA can be used to select a catalytic RNA having a specific ribonuclease activity from
a pool of RNA molecules. See, e.g., Bartel et al., (1993) Science 261:1411-1418.
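
As a rough illustration of how a target sequence might be screened before designing a
hammerhead ribozyme, the sketch below lists candidate cleavage triplets in an mRNA. It relies
on an assumption that is not stated in this disclosure: the commonly cited rule that hammerhead
ribozymes cleave 3' of an NUH triplet (N = any nucleotide, H = A, C or U). The function name is
hypothetical.

```python
import re
from typing import List, Tuple


def hammerhead_sites(mrna: str) -> List[Tuple[int, str]]:
    """Return (cleavage_position, triplet) for each NUH triplet in the mRNA.

    cleavage_position is the 0-based index of the first nucleotide 3' of the
    triplet, so the cut would fall between position - 1 and position.
    Assumption: the NUH rule is used as a simplified site criterion.
    """
    rna = mrna.upper().replace("T", "U")
    sites = []
    # A lookahead is used so that overlapping triplets are all reported.
    for match in re.finditer(r"(?=([ACGU]U[ACU]))", rna):
        sites.append((match.start() + 3, match.group(1)))
    return sites


if __name__ == "__main__":
    for position, triplet in hammerhead_sites("GGGUCAAA"):
        print(position, triplet)
```

A real design would additionally choose flanking arms complementary to the target on either
side of the triplet and check accessibility of the site; neither step is attempted here.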

Alternatively, gene expression can be inhibited by targeting nucleotide sequences
complementary to the regulatory region (e.g., promoter and/or enhancers) to form triple helical
structures that prevent transcription of the gene in target cells. See generally, Helene. (1991)
Anticancer Drug Des. 6: 569-84; Helene. et al. (1992) Ann. N.Y. Acad. Sci. 660:27-36; and
Maher (1992) Bioassays 14: 807-15.

In various embodiments, the nucleic acids of the invention can be modified at the base
moiety, sugar moiety or phosphate backbone to improve, e.g., the stability, hybridization, or
solubility of the molecule. For example, the deoxyribose phosphate backbone of the nucleic
acids can be modified to generate peptide nucleic acids (see Hyrup et al. (1996) Bioorg Med
Chem 4: 5-23). As used herein, the terms "peptide nucleic acids" or "PNAs" refer to nucleic acid
mimics, e.g., DNA mimics, in which the deoxyribose phosphate backbone is replaced by a
pseudopeptide backbone and only the four natural nucleobases are retained. The neutral
backbone of PNAs has been shown to allow for specific hybridization to DNA and RNA under
conditions of low ionic strength. The synthesis of PNA oligomers can be performed using
standard solid phase peptide synthesis protocols as described in Hyrup et al. (1996) above;
Perry-O'Keefe et al. (1996) PNAS 93: 14670-675.

PNAs of the invention can be used in therapeutic and diagnostic applications. For
example, PNAs can be used as antisense or antigene agents for sequence-specific modulation of
gene expression by, e.g., inducing transcription or translation arrest or inhibiting replication.
PNAs of the invention can also be used, e.g., in the analysis of single base pair mutations in a
gene by, e.g., PNA directed PCR clamping; as artificial restriction enzymes when used in
combination with other enzymes, e.g., S1 nucleases (Hyrup B. (1996) above); or as probes or
primers for DNA sequence and hybridization (Hyrup et al. (1996), above; Perry-O'Keefe (1996),
above).

In another embodiment, PNAs of the invention can be modified, e.g., to enhance their
stability or cellular uptake, by attaching lipophilic or other helper groups to PNA, by the
formation of PNA-DNA chimeras, or by the use of liposomes or other techniques of drug
delivery known in the art. For example, PNA-DNA chimeras can be generated that may
combine the advantageous properties of PNA and DNA. Such chimeras allow DNA recognition
enzymes, e.g., RNase H and DNA polymerases, to interact with the DNA portion while the PNA
portion would provide high binding affinity and specificity. PNA-DNA chimeras can be linked
using linkers of appropriate lengths selected in terms of base stacking, number of bonds between
the nucleobases, and orientation (Hyrup (1996) above). The synthesis of PNA-DNA chimeras
can be performed as described in Hyrup (1996) above and Finn et al. (1996) Nucl Acids Res 24:
3357-63. For example, a DNA chain can be synthesized on a solid support using standard
phosphoramidite coupling chemistry, and modified nucleoside analogs, e.g.,
5'-(4-methoxytrityl)amino-5'-deoxy-thymidine phosphoramidite, can be used between the PNA
and the 5' end of DNA (Mag et al. (1989) Nucl Acid Res 17: 5973-88). PNA monomers are then
coupled in a stepwise manner to produce a chimeric molecule with a 5' PNA segment and a 3'
DNA segment (Finn et al. (1996) above). Alternatively, chimeric molecules can be synthesized
with a 5' DNA segment and a 3' PNA segment. See, Petersen et al. (1975) Bioorg Med Chem
Lett 5: 1119-11124.

In other embodiments, the oligonucleotide may include other appended groups such as
peptides (e.g., for targeting host cell receptors in vivo), or agents facilitating transport across the
cell membrane (see, e.g., Letsinger et al., 1989, Proc. Natl. Acad. Sci. USA 86:6553-6556;
Lemaitre et al., 1987, Proc. Natl. Acad. Sci. 84:648-652; PCT Publication No. WO88/09810) or
the blood-brain barrier (see, e.g., PCT Publication No. WO89/10134). In addition,
oligonucleotides can be modified with hybridization triggered cleavage agents (See, e.g., Krol et
al., 1988, BioTechniques 6:958-976) or intercalating agents. (See, e.g., Zon, 1988, Pharm. Res.
5: 539-549). To this end, the oligonucleotide may be conjugated to another molecule, e.g., a
peptide, a hybridization triggered cross-linking agent, a transport agent, a hybridization-triggered
cleavage agent, etc.



4.5 HOSTS

The present invention further provides host cells genetically engineered to contain the
polynucleotides of the invention. For example, such host cells may contain nucleic acids of the
invention introduced into the host cell using known transformation, transfection or infection
methods. The present invention still further provides host cells genetically engineered to express
the polynucleotides of the invention, wherein such polynucleotides are in operative association
with a regulatory sequence heterologous to the host cell which drives expression of the
polynucleotides in the cell.

Knowledge of nucleic acid sequences allows for modification of cells to permit, or
increase, expression of endogenous polypeptide. Cells can be modified (e.g., by homologous
recombination) to provide increased polypeptide expression by replacing, in whole or in part, the
naturally occurring promoter with all or part of a heterologous promoter so that the cells express
the polypeptide at higher levels. The heterologous promoter is inserted in such a manner that it
is operatively linked to the encoding sequences. See, for example, PCT International Publication
No. WO94/12650, PCT International Publication No. WO92/20808, and PCT International
Publication No. WO91/09955. It is also contemplated that, in addition to heterologous promoter
DNA, amplifiable marker DNA (e.g., ada, dhfr, and the multifunctional CAD gene which
encodes carbamyl phosphate synthase, aspartate transcarbamylase, and dihydroorotase) and/or
intron DNA may be inserted along with the heterologous promoter DNA. If linked to the coding
sequence, amplification of the marker DNA by standard selection methods results in
co-amplification of the desired protein coding sequences in the cells.

The host cell can be a higher eukaryotic host cell, such as a mammalian cell, a lower
eukaryotic host cell, such as a yeast cell, or the host cell can be a prokaryotic cell, such as a
bacterial cell. Introduction of the recombinant construct into the host cell can be effected by
calcium phosphate transfection, DEAE-dextran mediated transfection, or electroporation (Davis,
L. et al., Basic Methods in Molecular Biology (1986)). The host cells containing one of the
polynucleotides of the invention can be used in conventional manners to produce the gene
product encoded by the isolated fragment (in the case of an ORF) or can be used to produce a
heterologous protein under the control of the EMF.

Any host/vector system can be used to express one or more of the ORFs of the present
invention. These include, but are not limited to, eukaryotic hosts such as HeLa cells, CV-1 cells,
COS cells, 293 cells, and Sf9 cells, as well as prokaryotic hosts such as E. coli and B. subtilis.
The most preferred cells are those which do not normally express the particular polypeptide or
protein or which express the polypeptide or protein at a low natural level. Mature proteins can
be expressed in mammalian cells, yeast, bacteria, or other cells under the control of appropriate
promoters. Cell-free translation systems can also be employed to produce such proteins using
RNAs derived from the DNA constructs of the present invention. Appropriate cloning and
expression vectors for use with prokaryotic and eukaryotic hosts are described by Sambrook, et
al., in Molecular Cloning: A Laboratory Manual, Second Edition, Cold Spring Harbor, New
York (1989), the disclosure of which is hereby incorporated by reference.

Various mammalian cell culture systems can also be employed to express recombinant
protein. Examples of mammalian expression systems include the COS-7 lines of monkey kidney
fibroblasts, described by Gluzman, Cell 23:175 (1981). Other cell lines capable of expressing a
compatible vector are, for example, the C127, monkey COS cells, Chinese Hamster Ovary
(CHO) cells, human kidney 293 cells, human epidermal A431 cells, human Colo205 cells, 3T3
cells, CV-1 cells, other transformed primate cell lines, normal diploid cells, cell strains derived
from in vitro culture of primary tissue, primary explants, HeLa cells, mouse L cells, BHK,
HL-60, U937, HaK or Jurkat cells. Mammalian expression vectors will comprise an origin of
replication, a suitable promoter and also any necessary ribosome binding sites, polyadenylation
site, splice donor and acceptor sites, transcriptional termination sequences, and 5' flanking
nontranscribed sequences. DNA sequences derived from the SV40 viral genome, for example,
SV40 origin, early promoter, enhancer, splice, and polyadenylation sites may be used to provide
the required nontranscribed genetic elements. Recombinant polypeptides and proteins produced
in bacterial culture are usually isolated by initial extraction from cell pellets, followed by one or
more salting-out, aqueous ion exchange or size exclusion chromatography steps. Protein
refolding steps can be used, as necessary, in completing configuration of the mature protein.
Finally, high performance liquid chromatography (HPLC) can be employed for final purification
steps. Microbial cells employed in expression of proteins can be disrupted by any convenient
method, including freeze-thaw cycling, sonication, mechanical disruption, or use of cell lysing
agents.

Alternatively, it may be possible to produce the protein in lower eukaryotes such as yeast
or insects or in prokaryotes such as bacteria. Potentially suitable yeast strains include
Saccharomyces cerevisiae, Schizosaccharomyces pombe, Kluyveromyces strains, Candida, or
any yeast strain capable of expressing heterologous proteins. Potentially suitable bacterial
strains include Escherichia coli, Bacillus subtilis, Salmonella typhimurium, or any bacterial
strain capable of expressing heterologous proteins. If the protein is made in yeast or bacteria, it
may be necessary to modify the protein produced therein, for example by phosphorylation or
glycosylation of the appropriate sites, in order to obtain the functional protein. Such covalent
attachments may be accomplished using known chemical or enzymatic methods.

In another embodiment of the present invention, cells and tissues may be engineered to
express an endogenous gene comprising the polynucleotides of the invention under the control of
inducible regulatory elements, in which case the regulatory sequences of the endogenous gene
may be replaced by homologous recombination. As described herein, gene targeting can be used
to replace a gene's existing regulatory region with a regulatory sequence isolated from a different
gene or a novel regulatory sequence synthesized by genetic engineering methods. Such
regulatory sequences may be comprised of promoters, enhancers, scaffold-attachment regions,
negative regulatory elements, transcriptional initiation sites, regulatory protein binding sites or
combinations of said sequences. Alternatively, sequences which affect the structure or stability
of the RNA or protein produced may be replaced, removed, added, or otherwise modified by
targeting. These sequences include polyadenylation signals, mRNA stability elements, splice
sites, leader sequences for enhancing or modifying transport or secretion properties of the
protein, or other sequences which alter or improve the function or stability of protein or RNA
molecules.

The targeting event may be a simple insertion of the regulatory sequence, placing the
gene under the control of the new regulatory sequence, e.g., inserting a new promoter or
enhancer or both upstream of a gene. Alternatively, the targeting event may be a simple deletion
of a regulatory element, such as the deletion of a tissue-specific negative regulatory element.
Alternatively, the targeting event may replace an existing element; for example, a tissue-specific
enhancer can be replaced by an enhancer that has broader or different cell-type specificity than
the naturally occurring elements. Here, the naturally occurring sequences are deleted and new
sequences are added. In all cases, the identification of the targeting event may be facilitated by
the use of one or more selectable marker genes that are contiguous with the targeting DNA,
allowing for the selection of cells in which the exogenous DNA has integrated into the host cell
genome. The identification of the targeting event may also be facilitated by the use of one or
more marker genes exhibiting the property of negative selection, such that the negatively
selectable marker is linked to the exogenous DNA, but configured such that the negatively
selectable marker flanks the targeting sequence, and such that a correct homologous
recombination event with sequences in the host cell genome does not result in the stable
integration of the negatively selectable marker. Markers useful for this purpose include the
Herpes Simplex Virus thymidine kinase (TK) gene or the bacterial xanthine-guanine
phosphoribosyl-transferase (gpt) gene.

The gene targeting or gene activation techniques which can be used in accordance with
this aspect of the invention are more particularly described in U.S. Patent No. 5,272,071 to
Chappel; U.S. Patent No. 5,578,461 to Sherwin et al.; International Application No.
PCT/US92/09627 (WO93/09222) by Selden et al.; and International Application No.
PCT/US90/06436 (WO91/06667) by Skoultchi et al., each of which is incorporated by reference
herein in its entirety.

4.6 POLYPEPTIDES OF THE INVENTION 

The isolated polypeptides of the invention include, but are not limited to, a polypeptide
comprising: the amino acid sequences set forth as any one of SEQ ID NO: 985-1968, 2953-3936,
3943-3948 or 3955-3960 or an amino acid sequence encoded by any one of the nucleotide
sequences SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954 or the corresponding full
length or mature protein. Polypeptides of the invention also include polypeptides preferably with
biological or immunological activity that are encoded by: (a) a polynucleotide having any one of
the nucleotide sequences set forth in SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954 or
(b) polynucleotides encoding any one of the amino acid sequences set forth as SEQ ID NO:
985-1968, 2953-3936, 3943-3948 or 3955-3960 or (c) polynucleotides that hybridize to the
complement of the polynucleotides of either (a) or (b) under stringent hybridization conditions.
The invention also provides biologically active or immunologically active variants of any of the
amino acid sequences set forth as SEQ ID NO: 985-1968, 2953-3936, 3943-3948 or 3955-3960
or the corresponding full length or mature protein; and "substantial equivalents" thereof (e.g.,
with at least about 65%, at least about 70%, at least about 75%, at least about 80%, 81%, 82%,
83%, 84%, more typically at least about 85%, 86%, 87%, 88%, 89%, and more typically at least
about 90%, 91%, 92%, 93%, 94%, and even more typically at least about 95%, 96%, 97%, 98%,
99%, sequence identity) that retain biological activity. Polypeptides encoded by allelic variants
may have a similar, increased, or decreased activity compared to polypeptides comprising SEQ ID
NO: 985-1968, 2953-3936, 3943-3948 or 3955-3960.
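
To make the percent-identity thresholds above concrete, the following is a minimal sketch
that scores two sequences that have already been aligned (gaps written as "-"). It is not the
method of GAP, BLAST or any other program; the function names and the convention of dividing
by the number of alignment columns are assumptions chosen for illustration, and other
denominators are in common use.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the columns of a pre-computed pairwise alignment.

    Assumption: identity = matching residue columns / all columns that
    contain at least one residue. Gap-vs-residue columns count as mismatches.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("alignment contains no residues")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def meets_identity_threshold(aligned_a: str, aligned_b: str,
                             threshold: float = 65.0) -> bool:
    """True if the aligned pair reaches the given percent identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    print(percent_identity("MKTA-IAK", "MKTAYLAK"))
```

Sequence identity alone does not establish that a variant retains biological activity; that
would still have to be tested.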

Fragments of the proteins of the present invention which are capable of exhibiting
biological activity are also encompassed by the present invention. Fragments of the protein may
be in linear form or they may be cyclized using known methods, for example, as described in H.
U. Saragovi, et al., Bio/Technology 10, 773-778 (1992) and in R. S. McDowell, et al., J. Amer.
Chem. Soc. 114, 9245-9253 (1992), both of which are incorporated herein by reference. Such
fragments may be fused to carrier molecules such as immunoglobulins for many purposes,
including increasing the valency of protein binding sites.

The present invention also provides both full-length and mature forms (for example,
without a signal sequence or precursor sequence) of the disclosed proteins. The protein coding
sequence is identified in the sequence listing by translation of the disclosed nucleotide
sequences. The mature form of such protein may be obtained by expression of a full-length
polynucleotide in a suitable mammalian cell or other host cell. The sequence of the mature form
of the protein is also determinable from the amino acid sequence of the full-length form. Where
proteins of the present invention are membrane bound, soluble forms of the proteins are also
provided. In such forms, part or all of the regions causing the proteins to be membrane bound
are deleted so that the proteins are fully secreted from the cell in which they are expressed.

Protein compositions of the present invention may further comprise an acceptable carrier,
such as a hydrophilic, e.g., pharmaceutically acceptable, carrier.

The present invention further provides isolated polypeptides encoded by the nucleic acid
fragments of the present invention or by degenerate variants of the nucleic acid fragments of the
present invention. By "degenerate variant" is intended nucleotide fragments which differ from a
nucleic acid fragment of the present invention (e.g., an ORF) by nucleotide sequence but, due to
the degeneracy of the genetic code, encode an identical polypeptide sequence. Preferred nucleic
acid fragments of the present invention are the ORFs that encode proteins.
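
The definition of a "degenerate variant" can be checked mechanically. The sketch below,
offered only as an illustration, translates two ORFs with the standard genetic code and
reports whether they differ in nucleotide sequence yet encode the same polypeptide. The
function names and the example sequences are hypothetical and are not taken from the sequence
listing.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(orf: str) -> str:
    """Translate an in-frame ORF (DNA or RNA), stopping at the first stop codon."""
    dna = orf.upper().replace("U", "T")
    if len(dna) % 3 != 0:
        raise ValueError("ORF length must be a multiple of three")
    protein = []
    for i in range(0, len(dna), 3):
        residue = CODON_TABLE[dna[i:i + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


def is_degenerate_variant(orf_a: str, orf_b: str) -> bool:
    """True if the ORFs differ in sequence but encode an identical polypeptide."""
    a = orf_a.upper().replace("U", "T")
    b = orf_b.upper().replace("U", "T")
    return a != b and translate(a) == translate(b)


if __name__ == "__main__":
    print(translate("ATGGCTTTATAA"))
    print(is_degenerate_variant("ATGGCTTTATAA", "ATGGCCCTGTGA"))
```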

A variety of methodologies known in the art can be utilized to obtain any one of the
isolated polypeptides or proteins of the present invention. At the simplest level, the amino acid
sequence can be synthesized using commercially available peptide synthesizers. The
synthetically-constructed protein sequences, by virtue of sharing primary, secondary or tertiary
structural and/or conformational characteristics with proteins may possess biological properties
in common therewith, including protein activity. This technique is particularly useful in
producing small peptides and fragments of larger polypeptides. Fragments are useful, for
example, in generating antibodies against the native polypeptide. Thus, they may be employed
as biologically active or immunological substitutes for natural, purified proteins in screening of
therapeutic compounds and in immunological processes for the development of antibodies.

The polypeptides and proteins of the present invention can alternatively be purified from
cells which have been altered to express the desired polypeptide or protein. As used herein, a
cell is said to be altered to express a desired polypeptide or protein when the cell, through genetic
manipulation, is made to produce a polypeptide or protein which it normally does not produce or
which the cell normally produces at a lower level. One skilled in the art can readily adapt
procedures for introducing and expressing either recombinant or synthetic sequences into
eukaryotic or prokaryotic cells in order to generate a cell which produces one of the polypeptides
or proteins of the present invention.

The invention also relates to methods for producing a polypeptide comprising growing a
culture of host cells of the invention in a suitable culture medium, and purifying the protein from
the cells or the culture in which the cells are grown. For example, the methods of the invention
include a process for producing a polypeptide in which a host cell containing a suitable
expression vector that includes a polynucleotide of the invention is cultured under conditions that
allow expression of the encoded polypeptide. The polypeptide can be recovered from the
culture, conveniently from the culture medium, or from a lysate prepared from the host cells and
further purified. Preferred embodiments include those in which the protein produced by such
process is a full length or mature form of the protein.

In an alternative method, the polypeptide or protein is purified from bacterial cells which
naturally produce the polypeptide or protein. One skilled in the art can readily follow known
methods for isolating polypeptides and proteins in order to obtain one of the isolated
polypeptides or proteins of the present invention. These include, but are not limited to,
immunochromatography, HPLC, size-exclusion chromatography, ion-exchange chromatography,
and immuno-affinity chromatography. See, e.g., Scopes, Protein Purification: Principles and
Practice, Springer-Verlag (1994); Sambrook, et al., in Molecular Cloning: A Laboratory
Manual; Ausubel et al., Current Protocols in Molecular Biology. Polypeptide fragments that
retain biological/immunological activity include fragments comprising greater than about 100
amino acids, or greater than about 200 amino acids, and fragments that encode specific protein
domains.

The purified polypeptides can be used in in vitro binding assays which are well known in
the art to identify molecules which bind to the polypeptides. These molecules include, but are not
limited to, e.g., small molecules, molecules from combinatorial libraries, antibodies or other
proteins. The molecules identified in the binding assay are then tested for antagonist or agonist
activity in in vivo tissue culture or animal models that are well known in the art. In brief, the
molecules are titrated into a plurality of cell cultures or animals and then tested for either
cell/animal death or prolonged survival of the animal/cells.

In addition, the peptides of the invention or molecules capable of binding to the peptides
may be complexed with toxins, e.g., ricin or cholera, or with other compounds that are toxic to
cells. The toxin-binding molecule complex is then targeted to a tumor or other cell by the
specificity of the binding molecule for SEQ ID NO: 985-1968, 2953-3936, 3943-3948 or
3955-3960.

The protein of the invention may also be expressed as a product of transgenic animals,
e.g., as a component of the milk of transgenic cows, goats, pigs, or sheep which are characterized
by somatic or germ cells containing a nucleotide sequence encoding the protein.

The proteins provided herein also include proteins characterized by amino acid sequences
similar to those of purified proteins but into which modifications are naturally provided or
deliberately engineered. For example, modifications, in the peptide or DNA sequence, can be
made by those skilled in the art using known techniques. Modifications of interest in the protein
sequences may include the alteration, substitution, replacement, insertion or deletion of a
selected amino acid residue in the coding sequence. For example, one or more of the cysteine
residues may be deleted or replaced with another amino acid to alter the conformation of the
molecule. Techniques for such alteration, substitution, replacement, insertion or deletion are
well known to those skilled in the art (see, e.g., U.S. Pat. No. 4,518,584). Preferably, such
alteration, substitution, replacement, insertion or deletion retains the desired activity of the
protein. Regions of the protein that are important for the protein function can be determined by
various methods known in the art including the alanine-scanning method which involves
systematic substitution of single or strings of amino acids with alanine, followed by testing the
resulting alanine-containing variant for biological activity. This type of analysis determines the
importance of the substituted amino acid(s) in biological activity. Regions of the protein that are
important for protein function may be determined by the eMATRIX program.
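
The alanine-scanning method described above lends itself to a short illustration. The sketch
below enumerates single-residue alanine substitutions of a polypeptide; it covers only the
single-substitution case, not strings of residues, and does nothing to assay activity. The
function name, the mutation labels such as "K2A", and the example peptide are assumptions
introduced here.

```python
from typing import List, Tuple


def alanine_scan(protein: str) -> List[Tuple[str, str]]:
    """Return (label, variant) pairs for every single alanine substitution.

    Labels follow the pattern <original residue><1-based position>A.
    Positions that are already alanine are skipped, since substituting
    them would leave the sequence unchanged.
    """
    sequence = protein.upper()
    variants = []
    for index, residue in enumerate(sequence):
        if residue == "A":
            continue
        variant = sequence[:index] + "A" + sequence[index + 1:]
        variants.append((f"{residue}{index + 1}A", variant))
    return variants


if __name__ == "__main__":
    for label, variant in alanine_scan("MKAC"):
        print(label, variant)
```

Each variant would then be expressed and tested for biological activity, and a loss of
activity would point to the substituted position as functionally important.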

Other fragments and derivatives of the sequences of proteins which would be expected to
retain protein activity in whole or in part and are useful for screening or other immunological
methodologies may also be easily made by those skilled in the art given the disclosures herein.
Such modifications are encompassed by the present invention.

The protein may also be produced by operably linking the isolated polynucleotide of the
invention to suitable control sequences in one or more insect expression vectors, and employing
an insect expression system. Materials and methods for baculovirus/insect cell expression
systems are commercially available in kit form from, e.g., Invitrogen, San Diego, Calif., U.S.A.
(the MaxBac™ kit), and such methods are well known in the art, as described in Summers and
Smith, Texas Agricultural Experiment Station Bulletin No. 1555 (1987), incorporated herein by
reference. As used herein, an insect cell capable of expressing a polynucleotide of the present
invention is "transformed."

The protein of the invention may be prepared by culturing transformed host cells under
culture conditions suitable to express the recombinant protein. The resulting expressed protein
may then be purified from such culture (i.e., from culture medium or cell extracts) using known
purification processes, such as gel filtration and ion exchange chromatography. The purification
of the protein may also include an affinity column containing agents which will bind to the
protein; one or more column steps over such affinity resins as concanavalin A-agarose,
heparin-toyopearl™ or Cibacrom blue 3GA Sepharose™; one or more steps involving
hydrophobic interaction chromatography using such resins as phenyl ether, butyl ether, or propyl
ether; or immunoaffinity chromatography.

Alternatively, the protein of the invention may also be expressed in a form which will
facilitate purification. For example, it may be expressed as a fusion protein, such as those of
maltose binding protein (MBP), glutathione-S-transferase (GST) or thioredoxin (TRX), or as a
His tag. Kits for expression and purification of such fusion proteins are commercially available
from New England BioLab (Beverly, Mass.), Pharmacia (Piscataway, N.J.) and Invitrogen,
respectively. The protein can also be tagged with an epitope and subsequently purified by using
a specific antibody directed to such epitope. One such epitope ("FLAG®") is commercially
available from Kodak (New Haven, Conn.).

Finally, one or more reverse-phase high performance liquid chromatography (RP-HPLC)
steps employing hydrophobic RP-HPLC media, e.g., silica gel having pendant methyl or other
aliphatic groups, can be employed to further purify the protein. Some or all of the foregoing
purification steps, in various combinations, can also be employed to provide a substantially
homogeneous isolated recombinant protein. The protein thus purified is substantially free of
other mammalian proteins and is defined in accordance with the present invention as an "isolated
protein."

The polypeptides of the invention include analogs (variants). This embraces fragments,
as well as peptides in which one or more amino acids has been deleted, inserted, or substituted.
Also, analogs of the polypeptides of the invention embrace fusions of the polypeptides or
modifications of the polypeptides of the invention, wherein the polypeptide or analog is fused to
another moiety or moieties, e.g., targeting moiety or another therapeutic agent. Such analogs
may exhibit improved properties such as activity and/or stability. Examples of moieties which
may be fused to the polypeptide or an analog include, for example, targeting moieties which
provide for the delivery of polypeptide to pancreatic cells, e.g., antibodies to pancreatic cells,
antibodies to immune cells such as T-cells, monocytes, dendritic cells, granulocytes, etc., as well
as receptor and ligands expressed on pancreatic or immune cells. Other moieties which may be
fused to the polypeptide include therapeutic agents which are used for treatment, for example,
immunosuppressive drugs such as cyclosporin, SK506, azathioprine, CD3 antibodies and
steroids. Also, polypeptides may be fused to immune modulators, and other cytokines such as
alpha or beta interferon.

4.6.1 DETERMINING POLYPEPTIDE AND POLYNUCLEOTIDE IDENTITY
AND SIMILARITY

Preferred methods for determining identity and/or similarity are designed to give the largest
match between the sequences tested. Methods to determine identity and similarity are codified in
computer programs including, but not limited to, the GCG program package, including GAP
(Devereux, J., et al., Nucleic Acids Research 12(1):387 (1984); Genetics Computer Group,
University of Wisconsin, Madison, WI), BLASTP, BLASTN, BLASTX, FASTA (Altschul, S.F.
et al., J. Molec. Biol. 215:403-410 (1990)), PSI-BLAST (Altschul S.F. et al., Nucleic Acids Res.
vol. 25, pp. 3389-3402, herein incorporated by reference), eMatrix software (Wu et al., J. Comp.
Biol., Vol. 6, pp. 219-235 (1999), herein incorporated by reference), eMotif software
(Nevill-Manning et al., ISMB-97, Vol. 4, pp. 202-209, herein incorporated by reference), pFam
software (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1), pp. 320-322 (1998), herein
incorporated by reference) and the Kyte-Doolittle hydrophobicity prediction algorithm (J. Mol
Biol. 157, pp. 105-31 (1982), incorporated herein by reference). The BLAST programs are
publicly available from the National Center for Biotechnology Information (NCBI) and other
sources (BLAST Manual, Altschul, S., et al. NCBI NLM NIH Bethesda, MD 20894; Altschul, S.,
et al., J. Mol. Biol. 215:403-410 (1990)).
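
By way of illustration only, the notion of finding "the largest match between the sequences tested" can be sketched with a small global alignment routine. The sketch below is not GAP, BLAST or FASTA; it is a plain Needleman-Wunsch alignment with a linear gap penalty, and the scoring values, the function names and the choice of alignment length as the denominator for percent identity are all assumptions made for the example.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-1):
    """Needleman-Wunsch global alignment with a linear gap penalty.

    Scoring values are illustrative assumptions, not those of any named program.
    Returns the two aligned strings, padded with '-' for gaps.
    """
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover one optimal alignment.
    i, j = n, m
    out_a, out_b = [], []
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(a, b):
    """Percent of alignment columns holding identical residues (gaps count as non-identical)."""
    aligned_a, aligned_b = global_align(a, b)
    if not aligned_a:
        return 0.0
    matches = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-")
    return 100.0 * matches / len(aligned_a)
```

Other conventions divide by the length of the shorter sequence or by the number of ungapped columns, so a reported identity value is only meaningful together with the program and parameters used to obtain it.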

4.7 CHIMERIC AND FUSION PROTEINS

The invention also provides chimeric or fusion proteins. As used herein, a "chimeric
protein" or "fusion protein" comprises a polypeptide of the invention operatively linked to
another polypeptide. Within a fusion protein the polypeptide according to the invention can
correspond to all or a portion of a protein according to the invention. In one embodiment, a
fusion protein comprises at least one biologically active portion of a protein according to the
invention. In another embodiment, a fusion protein comprises at least two biologically active
portions of a protein according to the invention. Within the fusion protein, the term "operatively
linked" is intended to indicate that the polypeptide according to the invention and the other
polypeptide are fused in-frame to each other. The polypeptide can be fused to the N-terminus or
C-terminus.

For example, in one embodiment a fusion protein comprises a polypeptide according to
the invention operably linked to the extracellular domain of a second protein.
In another embodiment, the fusion protein is a GST-fusion protein in which the polypeptide
sequences of the invention are fused to the C-terminus of the GST (i.e., glutathione
S-transferase) sequences.

In another embodiment, the fusion protein is an immunoglobulin fusion protein in which
the polypeptide sequences according to the invention comprise one or more domains fused to
sequences derived from a member of the immunoglobulin protein family. The immunoglobulin
fusion proteins of the invention can be incorporated into pharmaceutical compositions and
administered to a subject to inhibit an interaction between a ligand and a protein of the invention
on the surface of a cell, to thereby suppress signal transduction in vivo. The immunoglobulin
fusion proteins can be used to affect the bioavailability of a cognate ligand. Inhibition of the
ligand/protein interaction may be useful therapeutically for both the treatment of proliferative
and differentiative disorders, e.g., cancer, as well as modulating (e.g., promoting or inhibiting)
cell survival. Moreover, the immunoglobulin fusion proteins of the invention can be used as
immunogens to produce antibodies in a subject, to purify ligands, and in screening assays to
identify molecules that inhibit the interaction of a polypeptide of the invention with a ligand.

A chimeric or fusion protein of the invention can be produced by standard recombinant
DNA techniques. For example, DNA fragments coding for the different polypeptide sequences
are ligated together in-frame in accordance with conventional techniques, e.g., by employing
blunt-ended or stagger-ended termini for ligation, restriction enzyme digestion to provide for
appropriate termini, filling-in of cohesive ends as appropriate, alkaline phosphatase treatment to
avoid undesirable joining, and enzymatic ligation. In another embodiment, the fusion gene can
be synthesized by conventional techniques including automated DNA synthesizers.
Alternatively, PCR amplification of gene fragments can be carried out using anchor primers that
give rise to complementary overhangs between two consecutive gene fragments that can
subsequently be annealed and reamplified to generate a chimeric gene sequence (see, for
example, Ausubel et al. (eds.) CURRENT PROTOCOLS IN MOLECULAR BIOLOGY, John Wiley &
Sons, 1992). Moreover, many expression vectors are commercially available that already encode
a fusion moiety (e.g., a GST polypeptide). A nucleic acid encoding a polypeptide of the
invention can be cloned into such an expression vector such that the fusion moiety is linked
in-frame to the protein of the invention.
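
The requirement that the fusion moiety be linked "in-frame" can be illustrated with a short sequence-level check. The sketch below is a hypothetical helper, not part of any cloning kit or vector software: it assumes both inputs are coding sequences written 5' to 3' on the sense strand, drops a terminal stop codon from the upstream moiety, and refuses to join pieces that would shift the reading frame or terminate translation early.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def codons(seq):
    """Split a sequence into consecutive triplets, ignoring any trailing partial codon."""
    return [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]


def fuse_in_frame(upstream_cds, downstream_cds, linker=""):
    """Join two coding sequences so the downstream one stays in the upstream reading frame.

    Hypothetical illustration: 'upstream_cds' stands for the fusion moiety (e.g., a tag),
    'downstream_cds' for the sequence encoding the polypeptide of interest.
    """
    up = upstream_cds.upper()
    link = linker.upper()
    down = downstream_cds.upper()
    # A terminal stop codon on the upstream moiety would end translation before the fusion.
    if len(up) % 3 == 0 and up[-3:] in STOP_CODONS:
        up = up[:-3]
    for name, part in (("upstream", up), ("linker", link)):
        if len(part) % 3 != 0:
            raise ValueError(f"{name} length is not a multiple of 3; fusion would be out of frame")
        if any(c in STOP_CODONS for c in codons(part)):
            raise ValueError(f"{name} contains an in-frame stop codon")
    if len(down) % 3 != 0:
        raise ValueError("downstream length is not a multiple of 3")
    return up + link + down
```

For example, `fuse_in_frame("ATGGCCTAA", "ATGAAATGA")` removes the upstream `TAA` and returns a single open reading frame; an upstream fragment of five nucleotides is rejected because every downstream codon would be shifted.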

4.8 GENE THERAPY

Mutations in the polynucleotides of the invention may result in loss of normal
function of the encoded protein. The invention thus provides gene therapy to restore normal
activity of the polypeptides of the invention, or to treat disease states involving polypeptides of
the invention. Delivery of a functional gene encoding polypeptides of the invention to
appropriate cells is effected ex vivo, in situ, or in vivo by use of vectors, and more particularly
viral vectors (e.g., adenovirus, adeno-associated virus, or a retrovirus), or ex vivo by use of
physical DNA transfer methods (e.g., liposomes or chemical treatments). See, for example,
Anderson, Nature, supplement to vol. 392, no. 6679, pp. 25-30 (1998). For additional reviews of
gene therapy technology see Friedmann, Science, 244: 1275-1281 (1989); Verma, Scientific
American: 68-84 (1990); and Miller, Nature, 357: 455-460 (1992). Introduction of any one of
the nucleotides of the present invention or a gene encoding the polypeptides of the present
invention can also be accomplished with extrachromosomal substrates (transient expression) or
artificial chromosomes (stable expression). Cells may also be cultured ex vivo in the presence of
proteins of the present invention in order to proliferate or to produce a desired effect on or
activity in such cells. Treated cells can then be introduced in vivo for therapeutic purposes.

Alternatively, it is contemplated that in other human disease states, preventing the expression of
or inhibiting the activity of polypeptides of the invention will be useful in treating the disease
states. It is contemplated that antisense therapy or gene therapy could be applied to negatively
regulate the expression of polypeptides of the invention.

Other methods inhibiting expression of a protein include the introduction of antisense
molecules to the nucleic acids of the present invention, their complements, or their translated RNA
sequences, by methods known in the art. Further, the polypeptides of the present invention can be
inhibited by using targeted deletion methods, or the insertion of a negative regulatory element such
as a silencer, which is tissue specific.
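
As a purely illustrative aside, the sequence of an antisense molecule follows from the sense strand by reverse complementation. The sketch below assumes the target is given as sense-strand cDNA (DNA alphabet, 5' to 3'); the function names and the idea of selecting a window by start position and length are assumptions for the example, and no claim is made about which target regions are effective.

```python
# Translation table mapping each DNA base to its Watson-Crick complement.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna):
    """Return the reverse complement of a DNA sequence, written 5' to 3'."""
    return dna.translate(COMPLEMENT)[::-1]


def antisense_oligo(sense_cdna, start, length):
    """Return a DNA oligonucleotide complementary to a window of the sense strand.

    'start' is a 0-based offset into 'sense_cdna'; both parameters are hypothetical
    conveniences for this sketch.
    """
    if start < 0 or length <= 0 or start + length > len(sense_cdna):
        raise ValueError("requested window lies outside the sequence")
    return reverse_complement(sense_cdna[start:start + length])
```

For instance, the window `ATGGCC` on the sense strand corresponds to the antisense oligonucleotide `GGCCAT`.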

The present invention still further provides cells genetically engineered in vivo to express the
polynucleotides of the invention, wherein such polynucleotides are in operative association with a
regulatory sequence heterologous to the host cell which drives expression of the polynucleotides in
the cell. These methods can be used to increase or decrease the expression of the polynucleotides of
the present invention.

Knowledge of DNA sequences provided by the invention allows for modification of cells to
permit, increase, or decrease, expression of endogenous polypeptide. Cells can be modified (e.g., by
homologous recombination) to provide increased polypeptide expression by replacing, in whole or
in part, the naturally occurring promoter with all or part of a heterologous promoter so that the cells
express the protein at higher levels. The heterologous promoter is inserted in such a manner that it is
operatively linked to the desired protein encoding sequences. See, for example, PCT International
Publication No. WO 94/12650, PCT International Publication No. WO 92/20808, and PCT
International Publication No. WO 91/09955. It is also contemplated that, in addition to heterologous
promoter DNA, amplifiable marker DNA (e.g., ada, dhfr, and the multifunctional CAD gene which
encodes carbamyl phosphate synthase, aspartate transcarbamylase, and dihydroorotase) and/or
intron DNA may be inserted along with the heterologous promoter DNA. If linked to the desired
protein coding sequence, amplification of the marker DNA by standard selection methods results in
co-amplification of the desired protein coding sequences in the cells.

In another embodiment of the present invention, cells and tissues may be engineered to
express an endogenous gene comprising the polynucleotides of the invention under the control of
inducible regulatory elements, in which case the regulatory sequences of the endogenous gene may
be replaced by homologous recombination. As described herein, gene targeting can be used to
replace a gene's existing regulatory region with a regulatory sequence isolated from a different gene
or a novel regulatory sequence synthesized by genetic engineering methods. Such regulatory
sequences may be comprised of promoters, enhancers, scaffold-attachment regions, negative
regulatory elements, transcriptional initiation sites, regulatory protein binding sites or combinations
of said sequences. Alternatively, sequences which affect the structure or stability of the RNA or
protein produced may be replaced, removed, added, or otherwise modified by targeting. These
sequences include polyadenylation signals, mRNA stability elements, splice sites, leader sequences
for enhancing or modifying transport or secretion properties of the protein, or other sequences
which alter or improve the function or stability of protein or RNA molecules.

The targeting event may be a simple insertion of the regulatory sequence, placing the gene
under the control of the new regulatory sequence, e.g., inserting a new promoter or enhancer or both
upstream of a gene. Alternatively, the targeting event may be a simple deletion of a regulatory
element, such as the deletion of a tissue-specific negative regulatory element. Alternatively, the
targeting event may replace an existing element; for example, a tissue-specific enhancer can be
replaced by an enhancer that has broader or different cell-type specificity than the naturally
occurring elements. Here, the naturally occurring sequences are deleted and new sequences are
added. In all cases, the identification of the targeting event may be facilitated by the use of one or
more selectable marker genes that are contiguous with the targeting DNA, allowing for the selection
of cells in which the exogenous DNA has integrated into the cell genome. The identification of the
targeting event may also be facilitated by the use of one or more marker genes exhibiting the
property of negative selection, such that the negatively selectable marker is linked to the exogenous
DNA, but configured such that the negatively selectable marker flanks the targeting sequence, and
such that a correct homologous recombination event with sequences in the host cell genome does
not result in the stable integration of the negatively selectable marker. Markers useful for this
purpose include the Herpes Simplex Virus thymidine kinase (TK) gene or the bacterial
xanthine-guanine phosphoribosyl-transferase (gpt) gene.

The gene targeting or gene activation techniques which can be used in accordance with this
aspect of the invention are more particularly described in U.S. Patent No. 5,272,071 to Chappel;
U.S. Patent No. 5,578,461 to Sherwin et al.; International Application No. PCT/US92/09627
(WO93/09222) by Selden et al.; and International Application No. PCT/US90/06436
(WO91/06667) by Skoultchi et al., each of which is incorporated by reference herein in its entirety.

4.9 TRANSGENIC ANIMALS

In preferred methods to determine biological functions of the polypeptides of the
invention in vivo, one or more genes provided by the invention are either over expressed or
inactivated in the germ line of animals using homologous recombination [Capecchi, Science
244:1288-1292 (1989)]. Animals in which the gene is over expressed, under the regulatory
control of exogenous or endogenous promoter elements, are known as transgenic animals.
Animals in which an endogenous gene has been inactivated by homologous recombination are
referred to as "knockout" animals. Knockout animals, preferably non-human mammals, can be
prepared as described in U.S. Patent No. 5,557,032, incorporated herein by reference. Transgenic
animals are useful to determine the roles polypeptides of the invention play in biological
processes, and preferably in disease states. Transgenic animals are useful as model systems to
identify compounds that modulate lipid metabolism. Transgenic animals, preferably non-human
mammals, are produced using methods as described in U.S. Patent No. 5,489,743 and PCT
Publication No. WO94/28122, incorporated herein by reference.

Transgenic animals can be prepared wherein all or part of a promoter of the
polynucleotides of the invention is either activated or inactivated to alter the level of expression
of the polypeptides of the invention. Inactivation can be carried out using homologous
recombination methods described above. Activation can be achieved by supplementing or even
replacing the homologous promoter to provide for increased protein expression. The homologous
promoter can be supplemented by insertion of one or more heterologous enhancer elements
known to confer promoter activation in a particular tissue.

The polynucleotides of the present invention also make possible the development,
through, e.g., homologous recombination or knock out strategies, of animals that fail to express
polypeptides of the invention or that express a variant polypeptide. Such animals are useful as
models for studying the in vivo activities of polypeptide as well as for studying modulators of the
polypeptides of the invention.

4.10 USES AND BIOLOGICAL ACTIVITY 

The polynucleotides and proteins of the present invention are expected to exhibit one or
more of the uses or biological activities (including those associated with assays cited herein)
identified herein. Uses or activities described for proteins of the present invention may be
provided by administration or use of such proteins or of polynucleotides encoding such proteins
(such as, for example, in gene therapies or vectors suitable for introduction of DNA). The
mechanism underlying the particular condition or pathology will dictate whether the
polypeptides of the invention, the polynucleotides of the invention or modulators (activators or
inhibitors) thereof would be beneficial to the subject in need of treatment. Thus, "therapeutic
compositions of the invention" include compositions comprising isolated polynucleotides
(including recombinant DNA molecules, cloned genes and degenerate variants thereof) or
polypeptides of the invention (including full length protein, mature protein and truncations or
domains thereof), or compounds and other substances that modulate the overall activity of the
target gene products, either at the level of target gene/protein expression or target protein
activity. Such modulators include polypeptides, analogs (variants), including fragments and
fusion proteins, antibodies and other binding proteins; chemical compounds that directly or
indirectly activate or inhibit the polypeptides of the invention (identified, e.g., via drug screening
assays as described herein); antisense polynucleotides and polynucleotides suitable for triple
helix formation; and in particular antibodies or other binding partners that specifically recognize
one or more epitopes of the polypeptides of the invention.

The polypeptides of the present invention may likewise be involved in cellular activation
or in one of the other physiological pathways described herein.

4.10.1 RESEARCH USES AND UTILITIES

The polynucleotides provided by the present invention can be used by the research
community for various purposes. The polynucleotides can be used to express recombinant
protein for analysis, characterization or therapeutic use; as markers for tissues in which the
corresponding protein is preferentially expressed (either constitutively or at a particular stage of
tissue differentiation or development or in disease states); as molecular weight markers on gels;
as chromosome markers or tags (when labeled) to identify chromosomes or to map related gene
positions; to compare with endogenous DNA sequences in patients to identify potential genetic
disorders; as probes to hybridize and thus discover novel, related DNA sequences; as a source of
information to derive PCR primers for genetic fingerprinting; as a probe to "subtract-out" known
sequences in the process of discovering other novel polynucleotides; for selecting and making
oligomers for attachment to a "gene chip" or other support, including for examination of
expression patterns; to raise anti-protein antibodies using DNA immunization techniques; and as
an antigen to raise anti-DNA antibodies or elicit another immune response. Where the
polynucleotide encodes a protein which binds or potentially binds to another protein (such as, for
example, in a receptor-ligand interaction), the polynucleotide can also be used in interaction trap
assays (such as, for example, that described in Gyuris et al., Cell 75:791-803 (1993)) to identify
polynucleotides encoding the other protein with which binding occurs or to identify inhibitors of
the binding interaction.

The polypeptides provided by the present invention can similarly be used in assays to
determine biological activity, including in a panel of multiple proteins for high-throughput
screening; to raise antibodies or to elicit another immune response; as a reagent (including the
labeled reagent) in assays designed to quantitatively determine levels of the protein (or its
receptor) in biological fluids; as markers for tissues in which the corresponding polypeptide is
preferentially expressed (either constitutively or at a particular stage of tissue differentiation or
development or in a disease state); and, of course, to isolate correlative receptors or ligands.
Proteins involved in these binding interactions can also be used to screen for peptide or small
molecule inhibitors or agonists of the binding interaction.

Any or all of these research utilities are capable of being developed into reagent grade or
kit format for commercialization as research products.

Methods for performing the uses listed above are well known to those skilled in the art.
References disclosing such methods include without limitation "Molecular Cloning: A
Laboratory Manual", 2d ed., Cold Spring Harbor Laboratory Press, Sambrook, J., E. F. Fritsch
and T. Maniatis eds., 1989, and "Methods in Enzymology: Guide to Molecular Cloning
Techniques", Academic Press, Berger, S. L. and A. R. Kimmel eds., 1987.

4.10.2 NUTRITIONAL USES

Polynucleotides and polypeptides of the present invention can also be used as nutritional
sources or supplements. Such uses include without limitation use as a protein or amino acid
supplement, use as a carbon source, use as a nitrogen source and use as a source of carbohydrate. In
such cases the polypeptide or polynucleotide of the invention can be added to the feed of a
particular organism or can be administered as a separate solid or liquid preparation, such as in the
form of powder, pills, solutions, suspensions or capsules. In the case of microorganisms, the
polypeptide or polynucleotide of the invention can be added to the medium in or on which the
microorganism is cultured.

4.10.3 CYTOKINE AND CELL PROLIFERATION/DIFFERENTIATION
ACTIVITY

A polypeptide of the present invention may exhibit activity relating to cytokine, cell
proliferation (either inducing or inhibiting) or cell differentiation (either inducing or inhibiting)
activity or may induce production of other cytokines in certain cell populations. A
polynucleotide of the invention can encode a polypeptide exhibiting such attributes. Many
protein factors discovered to date, including all known cytokines, have exhibited activity in one
or more factor-dependent cell proliferation assays, and hence the assays serve as a convenient
confirmation of cytokine activity. The activity of therapeutic compositions of the present
invention is evidenced by any one of a number of routine factor dependent cell proliferation
assays for cell lines including, without limitation, 32D, DA2, DA1G, T10, B9, B9/11, BaF3,
MC9/G, M+ (preB M+), 2E8, RB5, DA1, 123, T1165, HT2, CTLL2, TF-1, Mo7e, CMK,
HUVEC, and Caco. Therapeutic compositions of the invention can be used in the following:

Assays for T-cell or thymocyte proliferation include without limitation those described
in: Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. H. Margulies, E.
M. Shevach, W. Strober, Pub. Greene Publishing Associates and Wiley-Interscience (Chapter 3,
In Vitro assays for Mouse Lymphocyte Function 3.1-3.19; Chapter 7, Immunologic studies in
Humans); Takai et al., J. Immunol. 137:3494-3500, 1986; Bertagnolli et al., J. Immunol.
145:1706-1712, 1990; Bertagnolli et al., Cellular Immunology 133:327-341, 1991; Bertagnolli,
et al., J. Immunol. 149:3778-3783, 1992; Bowman et al., J. Immunol. 152:1756-1761, 1994.

Assays for cytokine production and/or proliferation of spleen cells, lymph node cells or
thymocytes include, without limitation, those described in: Polyclonal T cell stimulation,
Kruisbeek, A. M. and Shevach, E. M. In Current Protocols in Immunology. J. E. e.a. Coligan
eds. Vol 1 pp. 3.12.1-3.12.14, John Wiley and Sons, Toronto. 1994; and Measurement of mouse
and human interleukin-γ, Schreiber, R. D. In Current Protocols in Immunology. J. E. Coligan
eds. Vol 1 pp. 6.8.1-6.8.8, John Wiley and Sons, Toronto. 1994.

Assays for proliferation and differentiation of hematopoietic and lymphopoietic cells
include, without limitation, those described in: Measurement of Human and Murine Interleukin 2
and Interleukin 4, Bottomly, K., Davis, L. S. and Lipsky, P. E. In Current Protocols in
Immunology. J. E. e.a. Coligan eds. Vol 1 pp. 6.3.1-6.3.12, John Wiley and Sons, Toronto. 1991;
deVries et al., J. Exp. Med. 173:1205-1211, 1991; Moreau et al., Nature 336:690-692, 1988;
Greenberger et al., Proc. Natl. Acad. Sci. U.S.A. 80:2931-2938, 1983; Measurement of mouse
and human interleukin 6-Nordan, R. In Current Protocols in Immunology. J. E. Coligan eds. Vol
1 pp. 6.6.1-6.6.5, John Wiley and Sons, Toronto. 1991; Smith et al., Proc. Natl. Acad. Sci.
U.S.A. 83:1857-1861, 1986; Measurement of human Interleukin 11-Bennett, F., Giannotti, J.,
Clark, S. C. and Turner, K. J. In Current Protocols in Immunology. J. E. Coligan eds. Vol 1 pp.
6.15.1 John Wiley and Sons, Toronto. 1991; Measurement of mouse and human Interleukin
9-Ciarletta, A., Giannotti, J., Clark, S. C. and Turner, K. J. In Current Protocols in Immunology.
J. E. Coligan eds. Vol 1 pp. 6.13.1, John Wiley and Sons, Toronto. 1991.

Assays for T-cell clone responses to antigens (which will identify, among others, proteins
that affect APC-T cell interactions as well as direct T-cell effects by measuring proliferation and
cytokine production) include, without limitation, those described in: Current Protocols in
Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. H. Margulies, E. M. Shevach, W. Strober,
Pub. Greene Publishing Associates and Wiley-Interscience (Chapter 3, In Vitro assays for Mouse
Lymphocyte Function; Chapter 6, Cytokines and their cellular receptors; Chapter 7,
Immunologic studies in Humans); Weinberger et al., Proc. Natl. Acad. Sci. USA 77:6091-6095,
1980; Weinberger et al., Eur. J. Immun. 11:405-411, 1981; Takai et al., J. Immunol.
137:3494-3500, 1986; Takai et al., J. Immunol. 140:508-512, 1988.

4.10.4 STEM CELL GROWTH FACTOR ACTIVITY

A polypeptide of the present invention may exhibit stem cell growth factor activity and
be involved in the proliferation, differentiation and survival of pluripotent and totipotent stem
cells including primordial germ cells, embryonic stem cells, hematopoietic stem cells and/or
germ line stem cells. Administration of the polypeptide of the invention to stem cells in vivo or
ex vivo is expected to maintain and expand cell populations in a totipotential or pluripotential
state which would be useful for re-engineering damaged or diseased tissues, transplantation,
manufacture of bio-pharmaceuticals and the development of bio-sensors. The ability to produce
large quantities of human cells has important working applications for the production of human
proteins which currently must be obtained from non-human sources or donors, implantation of
cells to treat diseases such as Parkinson's, Alzheimer's and other neurodegenerative diseases;
tissues for grafting such as bone marrow, skin, cartilage, tendons, bone, muscle (including
cardiac muscle), blood vessels, cornea, neural cells, gastrointestinal cells and others; and organs
for transplantation such as kidney, liver, pancreas (including islet cells), heart and lung.

It is contemplated that multiple different exogenous growth factors and/or cytokines may
be administered in combination with the polypeptide of the invention to achieve the desired
effect, including any of the growth factors listed herein, other stem cell maintenance factors, and
specifically including stem cell factor (SCF), leukemia inhibitory factor (LIF), Flt-3 ligand
(Flt-3L), any of the interleukins, recombinant soluble IL-6 receptor fused to IL-6, macrophage
inflammatory protein 1-alpha (MIP-1-alpha), G-CSF, GM-CSF, thrombopoietin (TPO), platelet
factor 4 (PF-4), platelet-derived growth factor (PDGF), neural growth factors and basic fibroblast
growth factor (bFGF).

Since totipotent stem cells can give rise to virtually any mature cell type, expansion of
these cells in culture will facilitate the production of large quantities of mature cells. Techniques
for culturing stem cells are known in the art and administration of polypeptides of the invention,
optionally with other growth factors and/or cytokines, is expected to enhance the survival and
proliferation of the stem cell populations. This can be accomplished by direct administration of
the polypeptide of the invention to the culture medium. Alternatively, stroma cells transfected
with a polynucleotide that encodes for the polypeptide of the invention can be used as a feeder
layer for the stem cell populations in culture or in vivo. Stromal support cells for feeder layers
may include embryonic bone marrow fibroblasts, bone marrow stromal cells, fetal liver cells, or
cultured embryonic fibroblasts (see U.S. Patent No. 5,690,926).

Stem cells themselves can be transfected with a polynucleotide of the invention to induce
autocrine expression of the polypeptide of the invention. This will allow for generation of
undifferentiated totipotential/pluripotential stem cell lines that are useful as is or that can then be
differentiated into the desired mature cell types. These stable cell lines can also serve as a source
of undifferentiated totipotential/pluripotential mRNA to create cDNA libraries and templates for
polymerase chain reaction experiments. These studies would allow for the isolation and
identification of differentially expressed genes in stem cell populations that regulate stem cell
proliferation and/or maintenance.

Expansion and maintenance of totipotent stem cell populations will be useful in the
treatment of many pathological conditions. For example, polypeptides of the present invention
may be used to manipulate stem cells in culture to give rise to neuroepithelial cells that can be
used to augment or replace cells damaged by illness, autoimmune disease, accidental damage or
genetic disorders. The polypeptide of the invention may be useful for inducing the proliferation
of neural cells and for the regeneration of nerve and brain tissue, i.e. for the treatment of central
and peripheral nervous system diseases and neuropathies, as well as mechanical and traumatic
disorders which involve degeneration, death or trauma to neural cells or nerve tissue. In addition,
the expanded stem cell populations can also be genetically altered for gene therapy purposes and
to decrease host rejection of replacement tissues after grafting or implantation.

Expression of the polypeptide of the invention and its effect on stem cells can also be
manipulated to achieve controlled differentiation of the stem cells into more differentiated cell
types. A broadly applicable method of obtaining pure populations of a specific differentiated
cell type from undifferentiated stem cell populations involves the use of a cell-type specific
promoter driving a selectable marker. The selectable marker allows only cells of the desired type
to survive. For example, stem cells can be induced to differentiate into cardiomyocytes (Wobus
et al., Differentiation, 48: 173-182, (1991); Klug et al., J. Clin. Invest., 98(1): 216-224, (1998))
or skeletal muscle cells (Browder, L. W. In: Principles of Tissue Engineering eds. Lanza et al.,
Academic Press (1997)). Alternatively, directed differentiation of stem cells can be
accomplished by culturing the stem cells in the presence of a differentiation factor such as
retinoic acid and an antagonist of the polypeptide of the invention which would inhibit the
effects of endogenous stem cell factor activity and allow differentiation to proceed.

In vitro cultures of stem cells can be used to determine if the polypeptide of the invention
exhibits stem cell growth factor activity. Stem cells are isolated from any one of various cell
sources (including hematopoietic stem cells and embryonic stem cells) and cultured on a feeder
layer, as described by Thompson et al. Proc. Natl. Acad. Sci. U.S.A., 92: 7844-7848 (1995), in
the presence of the polypeptide of the invention alone or in combination with other growth
factors or cytokines. The ability of the polypeptide of the invention to induce stem cell
proliferation is determined by colony formation on semi-solid support e.g. as described by
Bernstein et al., Blood, 77: 2316-2321 (1991).

4.10.5 HEMATOPOIESIS REGULATING ACTIVITY

A polypeptide of the present invention may be involved in regulation of hematopoiesis
and, consequently, in the treatment of myeloid or lymphoid cell disorders. Even marginal
biological activity in support of colony forming cells or of factor-dependent cell lines indicates
involvement in regulating hematopoiesis, e.g. in supporting the growth and proliferation of
erythroid progenitor cells alone or in combination with other cytokines, thereby indicating utility,
for example, in treating various anemias or for use in conjunction with irradiation/chemotherapy
to stimulate the production of erythroid precursors and/or erythroid cells; in supporting the
growth and proliferation of myeloid cells such as granulocytes and monocytes/macrophages (i.e.,
traditional CSF activity) useful, for example, in conjunction with chemotherapy to prevent or
treat consequent myelo-suppression; in supporting the growth and proliferation of
megakaryocytes and consequently of platelets thereby allowing prevention or treatment of
various platelet disorders such as thrombocytopenia, and generally for use in place of or
complementary to platelet transfusions; and/or in supporting the growth and proliferation of
hematopoietic stem cells which are capable of maturing to any and all of the above-mentioned
hematopoietic cells and therefore find therapeutic utility in various stem cell disorders (such as
those usually treated with transplantation, including, without limitation, aplastic anemia and
paroxysmal nocturnal hemoglobinuria), as well as in repopulating the stem cell compartment
post irradiation/chemotherapy, either in-vivo or ex-vivo (i.e., in conjunction with bone marrow
transplantation or with peripheral progenitor cell transplantation (homologous or heterologous))
as normal cells or genetically manipulated for gene therapy.

Therapeutic compositions of the invention can be used in the following:
Suitable assays for proliferation and differentiation of various hematopoietic lines are
cited above.

Assays for embryonic stem cell differentiation (which will identify, among others,
proteins that influence embryonic differentiation hematopoiesis) include, without limitation,
those described in: Johansson et al. Cellular Biology 15:141-151, 1995; Keller et al., Molecular
and Cellular Biology 13:473-486, 1993; McClanahan et al., Blood 81:2903-2915, 1993.

Assays for stem cell survival and differentiation (which will identify, among others,
proteins that regulate lympho-hematopoiesis) include, without limitation, those described in:
Methylcellulose colony forming assays, Freshney, M. G. In Culture of Hematopoietic Cells. R. I.
Freshney, et al. eds. Vol pp. 265-268, Wiley-Liss, Inc., New York, N.Y. 1994; Hirayama et al.,
Proc. Natl. Acad. Sci. USA 89:5907-5911, 1992; Primitive hematopoietic colony forming cells
with high proliferative potential, McNiece, I. K. and Briddell, R. A. In Culture of Hematopoietic
Cells. R. I. Freshney, et al. eds. Vol pp. 23-39, Wiley-Liss, Inc., New York, N.Y. 1994; Neben et
al., Experimental Hematology 22:353-359, 1994; Cobblestone area forming cell assay,
Ploemacher, R. E. In Culture of Hematopoietic Cells. R. I. Freshney, et al. eds. Vol pp. 1-21,
Wiley-Liss, Inc., New York, N.Y. 1994; Long term bone marrow cultures in the presence of
stromal cells, Spooncer, E., Dexter, M. and Allen, T. In Culture of Hematopoietic Cells. R. I.
Freshney, et al. eds. Vol pp. 163-179, Wiley-Liss, Inc., New York, N.Y. 1994; Long term culture
initiating cell assay, Sutherland, H. J. In Culture of Hematopoietic Cells. R. I. Freshney, et al.
eds. Vol pp. 139-162, Wiley-Liss, Inc., New York, N.Y. 1994.

4.10.6 TISSUE GROWTH ACTIVITY



A polypeptide of the present invention also may be involved in bone, cartilage, tendon,
ligament and/or nerve tissue growth or regeneration, as well as in wound healing and tissue
repair and replacement, and in healing of burns, incisions and ulcers.

A polypeptide of the present invention which induces cartilage and/or bone growth in 
circumstances where bone is not normally formed, has application in the healing of bone
fractures and cartilage damage or defects in humans and other animals. Compositions of a
polypeptide, antibody, binding partner, or other modulator of the invention may have
prophylactic use in closed as well as open fracture reduction and also in the improved fixation of
artificial joints. De novo bone formation induced by an osteogenic agent contributes to the repair
of congenital, trauma induced, or oncologic resection induced craniofacial defects, and also is
useful in cosmetic plastic surgery. 

A polypeptide of this invention may also be involved in attracting bone-forming cells,
stimulating growth of bone-forming cells, or inducing differentiation of progenitors of
bone-forming cells. Treatment of osteoporosis, osteoarthritis, bone degenerative disorders, or
periodontal disease, such as through stimulation of bone and/or cartilage repair or by blocking
inflammation or processes of tissue destruction (collagenase activity, osteoclast activity, etc.)
mediated by inflammatory processes may also be possible using the composition of the
invention.

Another category of tissue regeneration activity that may involve the polypeptide of the
present invention is tendon/ligament formation. Induction of tendon/ligament-like tissue or
other tissue formation in circumstances where such tissue is not normally formed, has application
in the healing of tendon or ligament tears, deformities and other tendon or ligament defects in
humans and other animals. Such a preparation employing a tendon/ligament-like tissue inducing
protein may have prophylactic use in preventing damage to tendon or ligament tissue, as well as
use in the improved fixation of tendon or ligament to bone or other tissues, and in repairing
defects to tendon or ligament tissue. De novo tendon/ligament-like tissue formation induced by
a composition of the present invention contributes to the repair of congenital, trauma induced, or
other tendon or ligament defects of other origin, and is also useful in cosmetic plastic surgery for
attachment or repair of tendons or ligaments. The compositions of the present invention may
provide an environment to attract tendon- or ligament-forming cells, stimulate growth of tendon- or
ligament-forming cells, induce differentiation of progenitors of tendon- or ligament-forming
cells, or induce growth of tendon/ligament cells or progenitors ex vivo for return in vivo to effect
tissue repair. The compositions of the invention may also be useful in the treatment of tendinitis,
carpal tunnel syndrome and other tendon or ligament defects. The compositions may also include
an appropriate matrix and/or sequestering agent as a carrier as is well known in the art.

The compositions of the present invention may also be useful for proliferation of neural
cells and for regeneration of nerve and brain tissue, i.e. for the treatment of central and peripheral
nervous system diseases and neuropathies, as well as mechanical and traumatic disorders, which
involve degeneration, death or trauma to neural cells or nerve tissue. More specifically, a
composition may be used in the treatment of diseases of the peripheral nervous system, such as
peripheral nerve injuries, peripheral neuropathy and localized neuropathies, and central nervous
system diseases, such as Alzheimer's, Parkinson's disease, Huntington's disease, amyotrophic
lateral sclerosis, and Shy-Drager syndrome. Further conditions which may be treated in
accordance with the present invention include mechanical and traumatic disorders, such as spinal
cord disorders, head trauma and cerebrovascular diseases such as stroke. Peripheral neuropathies
resulting from chemotherapy or other medical therapies may also be treatable using a
composition of the invention.

Compositions of the invention may also be useful to promote better or faster closure of
non-healing wounds, including without limitation pressure ulcers, ulcers associated with vascular
insufficiency, surgical and traumatic wounds, and the like.

Compositions of the present invention may also be involved in the generation or
regeneration of other tissues, such as organs (including, for example, pancreas, liver, intestine,
kidney, skin, endothelium), muscle (smooth, skeletal or cardiac) and vascular (including vascular
endothelium) tissue; or for promoting the growth of cells comprising such tissues. Part of the
desired effects may be by inhibition or modulation of fibrotic scarring to allow normal tissue
to regenerate. A polypeptide of the present invention may also exhibit angiogenic activity.

A composition of the present invention may also be useful for gut protection or
regeneration and treatment of lung or liver fibrosis, reperfusion injury in various tissues, and
conditions resulting from systemic cytokine damage.

A composition of the present invention may also be useful for promoting or inhibiting
differentiation of tissues described above from precursor tissues or cells; or for inhibiting the
growth of tissues described above.

Therapeutic compositions of the invention can be used in the following:
Assays for tissue generation activity include, without limitation, those described in:
International Patent Publication No. WO95/16035 (bone, cartilage, tendon); International Patent
Publication No. WO95/05846 (nerve, neuronal); International Patent Publication No.
WO91/07491 (skin, endothelium).

Assays for wound healing activity include, without limitation, those described in: Winter,
Epidermal Wound Healing, pps. 71-112 (Maibach, H. I. and Rovee, D. T., eds.), Year Book
Medical Publishers, Inc., Chicago, as modified by Eaglstein and Mertz, J. Invest. Dermatol
71:382-84 (1978).

4.10.7 IMMUNE STIMULATING OR SUPPRESSING ACTIVITY 
A polypeptide of the present invention may also exhibit immune stimulating or immune
suppressing activity, including without limitation the activities for which assays are described
herein. A polynucleotide of the invention can encode a polypeptide exhibiting such activities. A
protein may be useful in the treatment of various immune deficiencies and disorders (including
severe combined immunodeficiency (SCID)), e.g., in regulating (up or down) growth and
proliferation of T and/or B lymphocytes, as well as effecting the cytolytic activity of NK cells
and other cell populations. These immune deficiencies may be genetic or be caused by viral (e.g.,
HIV) as well as bacterial or fungal infections, or may result from autoimmune disorders. More
specifically, infectious diseases caused by viral, bacterial, fungal or other infection may be
treatable using a protein of the present invention, including infections by HIV, hepatitis viruses,
herpes viruses, mycobacteria, Leishmania spp., malaria spp. and various fungal infections such
as candidiasis. Of course, in this regard, proteins of the present invention may also be useful
where a boost to the immune system generally may be desirable, i.e., in the treatment of cancer.

Autoimmune disorders which may be treated using a protein of the present invention
include, for example, connective tissue disease, multiple sclerosis, systemic lupus erythematosus,
rheumatoid arthritis, autoimmune pulmonary inflammation, Guillain-Barre syndrome,
autoimmune thyroiditis, insulin dependent diabetes mellitus, myasthenia gravis, graft-versus-host
disease and autoimmune inflammatory eye disease. Such a protein (or antagonists thereof,
including antibodies) of the present invention may also be useful in the treatment of allergic
reactions and conditions (e.g., anaphylaxis, serum sickness, drug reactions, food allergies, insect
venom allergies, mastocytosis, allergic rhinitis, hypersensitivity pneumonitis, urticaria,
angioedema, eczema, atopic dermatitis, allergic contact dermatitis, erythema multiforme,
Stevens-Johnson syndrome, allergic conjunctivitis, atopic keratoconjunctivitis, venereal
keratoconjunctivitis, giant papillary conjunctivitis and contact allergies), such as asthma
(particularly allergic asthma) or other respiratory problems. Other conditions, in which immune
suppression is desired (including, for example, organ transplantation), may also be treatable
using a protein (or antagonists thereof) of the present invention. The therapeutic effects of the
polypeptides or antagonists thereof on allergic reactions can be evaluated by in vivo animal
models such as the cumulative contact enhancement test (Lastbom et al., Toxicology 125: 59-66,
1998), skin prick test (Hoffmann et al., Allergy 54: 446-54, 1999), guinea pig skin sensitization
test (Vohr et al., Arch. Toxicol. 73: 501-9), and murine local lymph node assay (Kimber et al.,
J. Toxicol. Environ. Health 53: 563-79).

Using the proteins of the invention it may also be possible to modulate immune
responses, in a number of ways. Down regulation may be in the form of inhibiting or blocking an
immune response already in progress or may involve preventing the induction of an immune
response. The functions of activated T cells may be inhibited by suppressing T cell responses or
by inducing specific tolerance in T cells, or both. Immunosuppression of T cell responses is
generally an active, non-antigen-specific, process which requires continuous exposure of the T
cells to the suppressive agent. Tolerance, which involves inducing non-responsiveness or anergy
in T cells, is distinguishable from immunosuppression in that it is generally antigen-specific and
persists after exposure to the tolerizing agent has ceased. Operationally, tolerance can be
demonstrated by the lack of a T cell response upon reexposure to specific antigen in the absence
of the tolerizing agent.

Down regulating or preventing one or more antigen functions (including without
limitation B lymphocyte antigen functions (such as, for example, B7)), e.g., preventing high
level lymphokine synthesis by activated T cells, will be useful in situations of tissue, skin and
organ transplantation and in graft-versus-host disease (GVHD). For example, blockage of T cell
function should result in reduced tissue destruction in tissue transplantation. Typically, in tissue
transplants, rejection of the transplant is initiated through its recognition as foreign by T cells,
followed by an immune reaction that destroys the transplant. The administration of a therapeutic
composition of the invention may prevent cytokine synthesis by immune cells, such as T cells,
and thus acts as an immunosuppressant. Moreover, a lack of costimulation may also be sufficient
to anergize the T cells, thereby inducing tolerance in a subject. Induction of long-term tolerance
by B lymphocyte antigen-blocking reagents may avoid the necessity of repeated administration
of these blocking reagents. To achieve sufficient immunosuppression or tolerance in a subject, it
may also be necessary to block the function of a combination of B lymphocyte antigens.

The efficacy of particular therapeutic compositions in preventing organ transplant
rejection or GVHD can be assessed using animal models that are predictive of efficacy in
humans. Examples of appropriate systems which can be used include allogeneic cardiac grafts in
rats and xenogeneic pancreatic islet cell grafts in mice, both of which have been used to examine
the immunosuppressive effects of CTLA4Ig fusion proteins in vivo as described in Lenschow et
al., Science 257:789-792 (1992) and Turka et al., Proc. Natl. Acad. Sci. USA, 89:11102-11105
(1992). In addition, murine models of GVHD (see Paul ed., Fundamental Immunology, Raven
Press, New York, 1989, pp. 846-847) can be used to determine the effect of therapeutic
compositions of the invention on the development of that disease.

Blocking antigen function may also be therapeutically useful for treating autoimmune
diseases. Many autoimmune disorders are the result of inappropriate activation of T cells that are
reactive against self tissue and which promote the production of cytokines and autoantibodies
involved in the pathology of the diseases. Preventing the activation of autoreactive T cells may
reduce or eliminate disease symptoms. Administration of reagents which block stimulation of T
cells can be used to inhibit T cell activation and prevent production of autoantibodies or T
cell-derived cytokines which may be involved in the disease process. Additionally, blocking
reagents may induce antigen-specific tolerance of autoreactive T cells which could lead to
long-term relief from the disease. The efficacy of blocking reagents in preventing or alleviating
autoimmune disorders can be determined using a number of well-characterized animal models of
human autoimmune diseases. Examples include murine experimental autoimmune encephalitis,
systemic lupus erythematosus in MRL/lpr/lpr mice or NZB hybrid mice, murine autoimmune
collagen arthritis, diabetes mellitus in NOD mice and BB rats, and murine experimental
myasthenia gravis (see Paul ed., Fundamental Immunology, Raven Press, New York, 1989, pp.
840-856).

Upregulation of an antigen function (e.g., a B lymphocyte antigen function), as a means
of up regulating immune responses, may also be useful in therapy. Upregulation of immune
responses may be in the form of enhancing an existing immune response or eliciting an initial
immune response. For example, enhancing an immune response may be useful in cases of viral
infection, including systemic viral diseases such as influenza, the common cold, and encephalitis.

Alternatively, anti-viral immune responses may be enhanced in an infected patient by
removing T cells from the patient, costimulating the T cells in vitro with viral antigen-pulsed
APCs either expressing a peptide of the present invention or together with a stimulatory form of
a soluble peptide of the present invention and reintroducing the in vitro activated T cells into the
patient. Another method of enhancing anti-viral immune responses would be to isolate infected
cells from a patient, transfect them with a nucleic acid encoding a protein of the present
invention as described herein such that the cells express all or a portion of the protein on their
surface, and reintroduce the transfected cells into the patient. The infected cells would now be
capable of delivering a costimulatory signal to, and thereby activate, T cells in vivo.

A polypeptide of the present invention may provide the necessary stimulation signal to T
cells to induce a T cell mediated immune response against the transfected tumor cells. In
addition, tumor cells which lack MHC class I or MHC class II molecules, or which fail to
reexpress sufficient amounts of MHC class I or MHC class II molecules, can be transfected with
nucleic acid encoding all or a portion (e.g., a cytoplasmic-domain truncated portion) of an
MHC class I alpha chain protein and beta-2 microglobulin protein or an MHC class II alpha chain
protein and an MHC class II beta chain protein to thereby express MHC class I or MHC class II
proteins on the cell surface. Expression of the appropriate class I or class II MHC in conjunction
with a peptide having the activity of a B lymphocyte antigen (e.g., B7-1, B7-2, B7-3) induces a T
cell mediated immune response against the transfected tumor cell. Optionally, a gene encoding
an antisense construct which blocks expression of an MHC class II associated protein, such as
the invariant chain, can also be cotransfected with a DNA encoding a peptide having the activity
of a B lymphocyte antigen to promote presentation of tumor associated antigens and induce
tumor specific immunity. Thus, the induction of a T cell mediated immune response in a human
subject may be sufficient to overcome tumor-specific tolerance in the subject.

The activity of a protein of the invention may, among other means, be measured by the
following methods:

Suitable assays for thymocyte or splenocyte cytotoxicity include, without limitation,
those described in: Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D.
H. Margulies, E. M. Shevach, W. Strober, Pub. Greene Publishing Associates and
Wiley-Interscience (Chapter 3, In Vitro assays for Mouse Lymphocyte Function 3.1-3.19;
Chapter 7, Immunologic studies in Humans); Herrmann et al., Proc. Natl. Acad. Sci. USA
78:2488-2492, 1981; Herrmann et al., J. Immunol. 128:1968-1974, 1982; Handa et al., J.
Immunol. 135:1564-1572, 1985; Takai et al., J. Immunol. 137:3494-3500, 1986; Takai et al., J.
Immunol. 140:508-512, 1988; Bowman et al., J. Virology 61:1992-1998; Bertagnolli et al.,
Cellular Immunology 133:327-341, 1991; Brown et al., J. Immunol. 153:3079-3092, 1994.

Assays for T-cell-dependent immunoglobulin responses and isotype switching (which
will identify, among others, proteins that modulate T-cell dependent antibody responses and that
affect Th1/Th2 profiles) include, without limitation, those described in: Maliszewski, J.
Immunol. 144:3028-3033, 1990; and Assays for B cell function: In vitro antibody production,
Mond, J. J. and Brunswick, M. In Current Protocols in Immunology. J. E. e.a. Coligan eds. Vol 1
pp. 3.8.1-3.8.16, John Wiley and Sons, Toronto. 1994.

Mixed lymphocyte reaction (MLR) assays (which will identify, among others, proteins
that generate predominantly Th1 and CTL responses) include, without limitation, those described
in: Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. H. Margulies, E.
M. Shevach, W. Strober, Pub. Greene Publishing Associates and Wiley-Interscience (Chapter 3,
In Vitro assays for Mouse Lymphocyte Function 3.1-3.19; Chapter 7, Immunologic studies in
Humans); Takai et al., J. Immunol. 137:3494-3500, 1986; Takai et al., J. Immunol. 140:508-512,
1988; Bertagnolli et al., J. Immunol. 149:3778-3783, 1992.

Dendritic cell-dependent assays (which will identify, among others, proteins expressed by
dendritic cells that activate naive T-cells) include, without limitation, those described in: Guery
et al., J. Immunol. 134:536-544, 1995; Inaba et al., Journal of Experimental Medicine
173:549-559, 1991; Macatonia et al., Journal of Immunology 154:5071-5079, 1995; Porgador et
al., Journal of Experimental Medicine 182:255-260, 1995; Nair et al., Journal of Virology
67:4062-4069, 1993; Huang et al., Science 264:961-965, 1994; Macatonia et al., Journal of
Experimental Medicine 169:1255-1264, 1989; Bhardwaj et al., Journal of Clinical Investigation
94:797-807, 1994; and Inaba et al., Journal of Experimental Medicine 172:631-640, 1990.

Assays for lymphocyte survival/apoptosis (which will identify, among others, proteins
that prevent apoptosis after superantigen induction and proteins that regulate lymphocyte
homeostasis) include, without limitation, those described in: Darzynkiewicz et al., Cytometry
13:795-808, 1992; Gorczyca et al., Leukemia 7:659-670, 1993; Gorczyca et al., Cancer Research
53:1945-1951, 1993; Itoh et al., Cell 66:233-243, 1991; Zacharchuk, Journal of Immunology
145:4037-4045, 1990; Zamai et al., Cytometry 14:891-897, 1993; Gorczyca et al., International
Journal of Oncology 1:639-648, 1992.

Assays for proteins that influence early steps of T-cell commitment and development
include, without limitation, those described in: Antica et al., Blood 84:111-117, 1994; Fine et al.,
Cellular Immunology 155:111-122, 1994; Galy et al., Blood 85:2770-2778, 1995; Toki et al.,
Proc. Nat. Acad. Sci. USA 88:7548-7551, 1991.

4.10.8 ACTIVIN/INHIBIN ACTIVITY

A polypeptide of the present invention may also exhibit activin- or inhibin-related
activities. A polynucleotide of the invention may encode a polypeptide exhibiting such
characteristics. Inhibins are characterized by their ability to inhibit the release of follicle
stimulating hormone (FSH), while activins are characterized by their ability to stimulate the
release of follicle stimulating hormone (FSH). Thus, a polypeptide of the present invention,
alone or in heterodimers with a member of the inhibin family, may be useful as a contraceptive
based on the ability of inhibins to decrease fertility in female mammals and decrease
spermatogenesis in male mammals. Administration of sufficient amounts of other inhibins can
induce infertility in these mammals. Alternatively, the polypeptide of the invention, as a
homodimer or as a heterodimer with other protein subunits of the inhibin group, may be useful as
a fertility inducing therapeutic, based upon the ability of activin molecules in stimulating FSH
release from cells of the anterior pituitary. See, for example, U.S. Pat. No. 4,798,885. A
polypeptide of the invention may also be useful for advancement of the onset of fertility in
sexually immature mammals, so as to increase the lifetime reproductive performance of domestic
animals such as, but not limited to, cows, sheep and pigs.

The activity of a polypeptide of the invention may, among other means, be measured by
the following methods.

Assays for activin/inhibin activity include, without limitation, those described in: Vale et
al., Endocrinology 91:562-572, 1972; Ling et al., Nature 321:779-782, 1986; Vale et al., Nature
321:776-779, 1986; Mason et al., Nature 318:659-663, 1985; Forage et al., Proc. Natl. Acad. Sci.
USA 83:3091-3095, 1986.

4.10.9 CHEMOTACTIC/CHEMOKINETIC ACTIVITY

A polypeptide of the present invention may be involved in chemotactic or chemokinetic
activity for mammalian cells, including, for example, monocytes, fibroblasts, neutrophils,
T-cells, mast cells, eosinophils, epithelial and/or endothelial cells. A polynucleotide of the
invention can encode a polypeptide exhibiting such attributes. Chemotactic and chemokinetic
receptor activation can be used to mobilize or attract a desired cell population to a desired site of
action. Chemotactic or chemokinetic compositions (e.g. proteins, antibodies, binding partners, or
modulators of the invention) provide particular advantages in treatment of wounds and other
trauma to tissues, as well as in treatment of localized infections. For example, attraction of
lymphocytes, monocytes or neutrophils to tumors or sites of infection may result in improved
immune responses against the tumor or infecting agent.

A protein or peptide has chemotactic activity for a particular cell population if it can
stimulate, directly or indirectly, the directed orientation or movement of such cell population.
Preferably, the protein or peptide has the ability to directly stimulate directed movement of cells.
Whether a particular protein has chemotactic activity for a population of cells can be readily
determined by employing such protein or peptide in any known assay for cell chemotaxis.
Therapeutic compositions of the invention can be used in the following:

Assays for chemotactic activity (which will identify proteins that induce or prevent
chemotaxis) consist of assays that measure the ability of a protein to induce the migration of cells
across a membrane as well as the ability of a protein to induce the adhesion of one cell
population to another cell population. Suitable assays for movement and adhesion include,
without limitation, those described in: Current Protocols in Immunology, Ed by J. E. Coligan, A.
M. Kruisbeek, D. H. Margulies, E. M. Shevach, W. Strober, Pub. Greene Publishing Associates
and Wiley-Interscience (Chapter 6.12, Measurement of alpha and beta Chemokines
6.12.1-6.12.28); Taub et al. J. Clin. Invest. 95:1370-1376, 1995; Lind et al. APMIS 103:140-146,
1995; Muller et al. Eur. J. Immunol. 25:1744-1748; Gruber et al. J. of Immunol. 152:5860-5867,
1994; Johnston et al. J. of Immunol. 153:1762-1768, 1994.

4.10.10 HEMOSTATIC AND THROMBOLYTIC ACTIVITY

A polypeptide of the invention may also be involved in hemostasis or thrombolysis or
thrombosis. A polynucleotide of the invention can encode a polypeptide exhibiting such
attributes. Compositions may be useful in treatment of various coagulation disorders (including
hereditary disorders, such as hemophilias) or to enhance coagulation and other hemostatic events
in treating wounds resulting from trauma, surgery or other causes. A composition of the
invention may also be useful for dissolving or inhibiting formation of thromboses and for
treatment and prevention of conditions resulting therefrom (such as, for example, infarction of
cardiac and central nervous system vessels (e.g., stroke)).

Therapeutic compositions of the invention can be used in the following:

Assays for hemostatic and thrombolytic activity include, without limitation, those
described in: Linet et al., J. Clin. Pharmacol. 26:131-140, 1986; Burdick et al., Thrombosis Res.
45:413-419, 1987; Humphrey et al., Fibrinolysis 5:71-79 (1991); Schaub, Prostaglandins
35:467-474, 1988.

4.10.11 CANCER DIAGNOSIS AND THERAPY 

Polypeptides of the invention may be involved in cancer cell generation, proliferation or
metastasis. Detection of the presence or amount of polynucleotides or polypeptides of the
invention may be useful for the diagnosis and/or prognosis of one or more types of cancer. For
example, the presence or increased expression of a polynucleotide/polypeptide of the invention
may indicate a hereditary risk of cancer, a precancerous condition, or an ongoing malignancy.
Conversely, a defect in the gene or absence of the polypeptide may be associated with a cancer
condition. Identification of single nucleotide polymorphisms associated with cancer or a
predisposition to cancer may also be useful for diagnosis or prognosis.

Cancer treatments promote tumor regression by inhibiting tumor cell proliferation,
inhibiting angiogenesis (growth of new blood vessels that is necessary to support tumor growth)
and/or prohibiting metastasis by reducing tumor cell motility or invasiveness. Therapeutic
compositions of the invention may be effective in adult and pediatric oncology including in solid
phase tumors/malignancies, locally advanced tumors, human soft tissue sarcomas, metastatic
cancer, including lymphatic metastases, blood cell malignancies including multiple myeloma,
acute and chronic leukemias, and lymphomas, head and neck cancers including mouth cancer,
larynx cancer and thyroid cancer, lung cancers including small cell carcinoma and non-small cell
cancers, breast cancers including small cell carcinoma and ductal carcinoma, gastrointestinal
cancers including esophageal cancer, stomach cancer, colon cancer, colorectal cancer and polyps
associated with colorectal neoplasia, pancreatic cancers, liver cancer, urologic cancers including
bladder cancer and prostate cancer, malignancies of the female genital tract including ovarian
carcinoma, uterine (including endometrial) cancers, and solid tumor in the ovarian follicle,
kidney cancers including renal cell carcinoma, brain cancers including intrinsic brain tumors,
neuroblastoma, astrocytic brain tumors, gliomas, metastatic tumor cell invasion in the central
nervous system, bone cancers including osteomas, skin cancers including malignant melanoma,
tumor progression of human skin keratinocytes, squamous cell carcinoma, basal cell carcinoma,
hemangiopericytoma and Kaposi's sarcoma.

Polypeptides, polynucleotides, or modulators of polypeptides of the invention (including
inhibitors and stimulators of the biological activity of the polypeptide of the invention) may be
administered to treat cancer. Therapeutic compositions can be administered in therapeutically
effective dosages alone or in combination with adjuvant cancer therapy such as surgery,
chemotherapy, radiotherapy, thermotherapy, and laser therapy, and may provide a beneficial
effect, e.g. reducing tumor size, slowing rate of tumor growth, inhibiting metastasis, or otherwise
improving overall clinical condition, without necessarily eradicating the cancer.

The composition can also be administered in therapeutically effective amounts as a
portion of an anti-cancer cocktail. An anti-cancer cocktail is a mixture of the polypeptide or
modulator of the invention with one or more anti-cancer drugs in addition to a pharmaceutically
acceptable carrier for delivery. The use of anti-cancer cocktails as a cancer treatment is routine.
Anti-cancer drugs that are well known in the art and can be used as a treatment in combination
with the polypeptide or modulator of the invention include: Actinomycin D, Aminoglutethimide,
Asparaginase, Bleomycin, Busulfan, Carboplatin, Carmustine, Chlorambucil, Cisplatin
(cis-DDP), Cyclophosphamide, Cytarabine HCl (Cytosine arabinoside), Dacarbazine, Dactinomycin,
Daunorubicin HCl, Doxorubicin HCl, Estramustine phosphate sodium, Etoposide (VP16-213),
Floxuridine, 5-Fluorouracil (5-Fu), Flutamide, Hydroxyurea (hydroxycarbamide), Ifosfamide,
Interferon Alpha-2a, Interferon Alpha-2b, Leuprolide acetate (LHRH-releasing factor analog),
Lomustine, Mechlorethamine HCl (nitrogen mustard), Melphalan, Mercaptopurine, Mesna,
Methotrexate (MTX), Mitomycin, Mitoxantrone HCl, Octreotide, Plicamycin, Procarbazine HCl,
Streptozocin, Tamoxifen citrate, Thioguanine, Thiotepa, Vinblastine sulfate, Vincristine sulfate,
Amsacrine, Azacitidine, Hexamethylmelamine, Interleukin-2, Mitoguazone, Pentostatin,
Semustine, Teniposide, and Vindesine sulfate.

In addition, therapeutic compositions of the invention may be used for prophylactic
treatment of cancer. There are hereditary conditions and/or environmental situations (e.g.
exposure to carcinogens) known in the art that predispose an individual to developing cancers.
Under these circumstances, it may be beneficial to treat these individuals with therapeutically
effective doses of the polypeptide of the invention to reduce the risk of developing cancers.

In vitro models can be used to determine the effective doses of the polypeptide of the
invention as a potential cancer treatment. These in vitro models include proliferation assays of
cultured tumor cells, growth of cultured tumor cells in soft agar (see Freshney, (1987) Culture of
Animal Cells: A Manual of Basic Technique, Wiley-Liss, New York, NY, Ch 18 and Ch 21),
tumor systems in nude mice as described in Giovanella et al., J. Natl. Can. Inst., 52: 921-30
(1974), mobility and invasive potential of tumor cells in Boyden Chamber assays as described in
Pilkington et al., Anticancer Res., 17: 4107-9 (1997), and angiogenesis assays such as induction
of vascularization of the chick chorioallantoic membrane or induction of vascular endothelial
cell migration as described in Ribatta et al., Intl. J. Dev. Biol., 40: 1189-97 (1999) and Li et al.,
Clin. Exp. Metastasis, 17:423-9 (1999), respectively. Suitable tumor cell lines are available,
e.g. from American Type Tissue Culture Collection catalogs.

4.10.12 RECEPTOR/LIGAND ACTIVITY

A polypeptide of the present invention may also demonstrate activity as receptor,
receptor ligand or inhibitor or agonist of receptor/ligand interactions. A polynucleotide of the
invention can encode a polypeptide exhibiting such characteristics. Examples of such receptors
and ligands include, without limitation, cytokine receptors and their ligands, receptor kinases and
their ligands, receptor phosphatases and their ligands, receptors involved in cell-cell interactions
and their ligands (including without limitation, cellular adhesion molecules (such as selectins,
integrins and their ligands) and receptor/ligand pairs involved in antigen presentation, antigen
recognition and development of cellular and humoral immune responses). Receptors and ligands
are also useful for screening of potential peptide or small molecule inhibitors of the relevant
receptor/ligand interaction. Proteins of the present invention (including, without limitation,
fragments of receptors and ligands) may themselves be useful as inhibitors of receptor/ligand
interactions.

The activity of a polypeptide of the invention may, among other means, be measured by
the following methods:

Suitable assays for receptor-ligand activity include without limitation those described in:
Current Protocols in Immunology, Ed. by J. E. Coligan, A. M. Kruisbeek, D. H. Margulies, E. M.
Shevach, W. Strober, Pub. Greene Publishing Associates and Wiley-Interscience (Chapter 7.28,
Measurement of Cellular Adhesion under static conditions 7.28.1-7.28.22); Takai et al., Proc.
Natl. Acad. Sci. USA 84:6864-6868, 1987; Bierer et al., J. Exp. Med. 168:1145-1156, 1988;
Rosenstein et al., J. Exp. Med. 169:149-160, 1989; Stoltenborg et al., J. Immunol. Methods
175:59-68, 1994; Stitt et al., Cell 80:661-670, 1995.

By way of example, the polypeptides of the invention may be used as a receptor for a
ligand(s) thereby transmitting the biological activity of that ligand(s). Ligands may be identified
through binding assays, affinity chromatography, dihybrid screening assays, BIAcore assays, gel
overlay assays, or other methods known in the art.

Studies characterizing drugs or proteins as agonists, antagonists, partial agonists or
partial antagonists require the use of other proteins as competing ligands. The polypeptides of the
present invention or ligand(s) thereof may be labeled by being coupled to radioisotopes,
colorimetric molecules or toxin molecules by conventional methods ("Guide to Protein
Purification" Murray P. Deutscher (ed) Methods in Enzymology Vol. 182 (1990) Academic
Press, Inc. San Diego). Examples of radioisotopes include, but are not limited to, tritium and
carbon-14. Examples of colorimetric molecules include, but are not limited to, fluorescent
molecules such as fluorescamine, or rhodamine or other colorimetric molecules. Examples of
toxins include, but are not limited to, ricin.

4.10.13 DRUG SCREENING

This invention is particularly useful for screening chemical compounds by using the
novel polypeptides or binding fragments thereof in any of a variety of drug screening techniques.
The polypeptides or fragments employed in such a test may either be free in solution, affixed to a
solid support, borne on a cell surface or located intracellularly. One method of drug screening
utilizes eukaryotic or prokaryotic host cells which are stably transformed with recombinant
nucleic acids expressing the polypeptide or a fragment thereof. Drugs are screened against such
transformed cells in competitive binding assays. Such cells, either in viable or fixed form, can
be used for standard binding assays. One may measure, for example, the formation of complexes
between polypeptides of the invention or fragments and the agent being tested or examine the
diminution in complex formation between the novel polypeptides and an appropriate cell line,
which are well known in the art.
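
The diminution in complex formation mentioned above is commonly summarized as a percent
inhibition relative to an untreated control. The following is a minimal sketch of that arithmetic,
not a method taken from this specification: the function name, the background-subtraction step,
and all readings are hypothetical and chosen only for illustration.

```python
def percent_inhibition(signal_with_agent, signal_without_agent, background=0.0):
    """Percent decrease in complex formation caused by a test agent.

    All signals are in the same arbitrary units (e.g. bound label counts).
    Background subtraction is an assumption of this sketch.
    """
    control = signal_without_agent - background
    if control <= 0:
        raise ValueError("control signal must exceed background")
    treated = signal_with_agent - background
    return 100.0 * (1.0 - treated / control)


# Hypothetical readings, not data from the specification.
control_signal = 1000.0
background_signal = 50.0
readings = {"compound_A": 250.0, "compound_B": 900.0}

hits = {
    name: round(percent_inhibition(signal, control_signal, background_signal), 1)
    for name, signal in readings.items()
}
print(hits)
```

A compound that strongly reduces the signal (here the hypothetical compound_A) would be a
candidate competitor for the binding site; the cutoff used to call a "hit" is left to the assay
designer.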

Sources for test compounds that may be screened for ability to bind to or modulate (i.e.,
increase or decrease) the activity of polypeptides of the invention include (1) inorganic and
organic chemical libraries, (2) natural product libraries, and (3) combinatorial libraries
comprised of either random or mimetic peptides, oligonucleotides or organic molecules.

Chemical libraries may be readily synthesized or purchased from a number of
commercial sources, and may include structural analogs of known compounds or compounds
that are identified as "hits" or "leads" via natural product screening.

The sources of natural product libraries are microorganisms (including bacteria and
fungi), animals, plants or other vegetation, or marine organisms, and libraries for
screening may be created by: (1) fermentation and extraction of broths from soil, plant or marine
microorganisms or (2) extraction of the organisms themselves. Natural product libraries include
polyketides, non-ribosomal peptides, and (non-naturally occurring) variants thereof. For a
review, see Science 282:63-68 (1998).

Combinatorial libraries are composed of large numbers of peptides, oligonucleotides or
organic compounds and can be readily prepared by traditional automated synthesis methods,
PCR, cloning or proprietary synthetic methods. Of particular interest are peptide and
oligonucleotide combinatorial libraries. Still other libraries of interest include peptide, protein,
peptidomimetic, multiparallel synthetic collection, recombinatorial, and polypeptide libraries.
For a review of combinatorial chemistry and libraries created therefrom, see Myers, Curr. Opin.
Biotechnol. 8:701-707 (1997). For reviews and examples of peptidomimetic libraries, see
Al-Obeidi et al., Mol. Biotechnol. 9(3):205-23 (1998); Hruby et al., Curr. Opin. Chem. Biol.,
1(1):114-19 (1997); Dorner et al., Bioorg. Med. Chem., 4(5):709-15 (1996) (alkylated dipeptides).

Identification of modulators through use of the various libraries described herein permits
modification of the candidate "hit" (or "lead") to optimize the capacity of the "hit" to bind a
polypeptide of the invention. The molecules identified in the binding assay are then tested for
antagonist or agonist activity in in vivo tissue culture or animal models that are well known in the
art. In brief, the molecules are titrated into a plurality of cell cultures or animals and then tested
for either cell/animal death or prolonged survival of the animal/cells.

The binding molecules thus identified may be complexed with toxins, e.g., ricin or
cholera, or with other compounds that are toxic to cells such as radioisotopes. The toxin-binding
molecule complex is then targeted to a tumor or other cell by the specificity of the binding
molecule for a polypeptide of the invention. Alternatively, the binding molecules may be
complexed with imaging agents for targeting and imaging purposes.

4.10.14 ASSAY FOR RECEPTOR ACTIVITY 

The invention also provides methods to detect specific binding of a polypeptide, e.g. a
ligand or a receptor. The art provides numerous assays particularly useful for identifying
previously unknown binding partners for receptor polypeptides of the invention. For example,
expression cloning using mammalian or bacterial cells, or dihybrid screening assays can be used
to identify polynucleotides encoding binding partners. As another example, affinity
chromatography with the appropriate immobilized polypeptide of the invention can be used to
isolate polypeptides that recognize and bind polypeptides of the invention. There are a number
of different libraries used for the identification of compounds, and in particular small molecules,
that modulate (i.e., increase or decrease) biological activity of a polypeptide of the invention.
Ligands for receptor polypeptides of the invention can also be identified by adding exogenous
ligands, or cocktails of ligands, to two cell populations that are genetically identical except for
the expression of the receptor of the invention: one cell population expresses the receptor of the
invention whereas the other does not. The responses of the two cell populations to the addition of
ligand(s) are then compared. Alternatively, an expression library can be co-expressed with the
polypeptide of the invention in cells and assayed for an autocrine response to identify potential
ligand(s). As still another example, BIAcore assays, gel overlay assays, or other methods known
in the art can be used to identify binding partner polypeptides, including (1) organic and
inorganic chemical libraries, (2) natural product libraries, and (3) combinatorial libraries
comprised of random peptides, oligonucleotides or organic molecules.
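
The two-population comparison described above can be sketched as a simple fold-change filter.
This is an illustrative sketch only: the function name, the two-fold threshold, and the response
values are hypothetical assumptions, and the specification does not prescribe any particular
statistic for comparing the populations.

```python
def receptor_dependent_ligands(with_receptor, without_receptor, min_fold=2.0):
    """Return ligands whose response is at least `min_fold` higher in cells
    expressing the receptor than in otherwise identical cells lacking it.

    Both arguments map a ligand name to a measured response in the same
    arbitrary units. The fold-change criterion is an assumption of this sketch.
    """
    candidates = []
    for ligand, response_plus in with_receptor.items():
        response_minus = without_receptor[ligand]
        if response_minus > 0 and response_plus / response_minus >= min_fold:
            candidates.append(ligand)
    return sorted(candidates)


# Hypothetical responses, not data from the specification.
expressing = {"ligand_1": 12.0, "ligand_2": 1.1, "ligand_3": 5.0}
non_expressing = {"ligand_1": 1.0, "ligand_2": 1.0, "ligand_3": 4.0}

print(receptor_dependent_ligands(expressing, non_expressing))
```

Ligands that stimulate both populations equally (the hypothetical ligand_3) are excluded
because their effect does not depend on the receptor of interest.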

The role of downstream intracellular signaling molecules in the signaling cascade of the
polypeptide of the invention can be determined. For example, a chimeric protein in which the
cytoplasmic domain of the polypeptide of the invention is fused to the extracellular portion of a
protein, whose ligand has been identified, is produced in a host cell. The cell is then incubated
with the ligand specific for the extracellular portion of the chimeric protein, thereby activating
the chimeric receptor. Known downstream proteins involved in intracellular signaling can then
be assayed for expected modifications, i.e. phosphorylation. Other methods known to those in the
art can also be used to identify signaling molecules involved in receptor activity.

4.10.15 ANTI-INFLAMMATORY ACTIVITY

Compositions of the present invention may also exhibit anti-inflammatory activity. The
anti-inflammatory activity may be achieved by providing a stimulus to cells involved in the
inflammatory response, by inhibiting or promoting cell-cell interactions (such as, for example,
cell adhesion), by inhibiting or promoting chemotaxis of cells involved in the inflammatory
process, inhibiting or promoting cell extravasation, or by stimulating or suppressing production
of other factors which more directly inhibit or promote an inflammatory response. Compositions
with such activities can be used to treat inflammatory conditions (including chronic or acute
conditions), including without limitation inflammation associated with infection (such as septic
shock, sepsis or systemic inflammatory response syndrome (SIRS)), ischemia-reperfusion injury,
endotoxin lethality, arthritis, complement-mediated hyperacute rejection, nephritis, cytokine or
chemokine-induced lung injury, inflammatory bowel disease, Crohn's disease or resulting from
overproduction of cytokines such as TNF or IL-1. Compositions of the invention may also be
useful to treat anaphylaxis and hypersensitivity to an antigenic substance or material.
Compositions of this invention may be utilized to prevent or treat conditions such as, but not
limited to, sepsis, acute pancreatitis, endotoxin shock, cytokine induced shock, rheumatoid
arthritis, chronic inflammatory arthritis, pancreatic cell damage from diabetes mellitus type 1,
graft versus host disease, inflammatory bowel disease, inflammation associated with pulmonary
disease, other autoimmune disease or inflammatory disease, or as an antiproliferative agent such
as for acute or chronic myelogenous leukemia or in the prevention of premature labor secondary to
intrauterine infections.

4.10.16 LEUKEMIAS

Leukemias and related disorders may be treated or prevented by administration of a
therapeutic that promotes or inhibits function of the polynucleotides and/or polypeptides of the
invention. Such leukemias and related disorders include but are not limited to acute leukemia,
acute lymphocytic leukemia, acute myelocytic leukemia, myeloblastic, promyelocytic,
myelomonocytic, monocytic, erythroleukemia, chronic leukemia, chronic myelocytic
(granulocytic) leukemia and chronic lymphocytic leukemia (for a review of such disorders, see
Fishman et al., 1985, Medicine, 2d Ed., J.B. Lippincott Co., Philadelphia).

4.10.17 NERVOUS SYSTEM DISORDERS

Nervous system disorders, involving cell types which can be tested for efficacy of
intervention with compounds that modulate the activity of the polynucleotides and/or
polypeptides of the invention, and which can be treated upon thus observing an indication of
therapeutic utility, include but are not limited to nervous system injuries, and diseases or
disorders which result in either a disconnection of axons, a diminution or degeneration of
neurons, or demyelination. Nervous system lesions which may be treated in a patient (including
human and non-human mammalian patients) according to the invention include but are not
limited to the following lesions of either the central (including spinal cord, brain) or peripheral
nervous systems:

(i) traumatic lesions, including lesions caused by physical injury or associated with
surgery, for example, lesions which sever a portion of the nervous system, or compression
injuries;

(ii) ischemic lesions, in which a lack of oxygen in a portion of the nervous system
results in neuronal injury or death, including cerebral infarction or ischemia, or spinal cord
infarction or ischemia;

(iii) infectious lesions, in which a portion of the nervous system is destroyed or injured
as a result of infection, for example, by an abscess or associated with infection by human
immunodeficiency virus, herpes zoster, or herpes simplex virus or with Lyme disease,
tuberculosis, syphilis;

(iv) degenerative lesions, in which a portion of the nervous system is destroyed or
injured as a result of a degenerative process including but not limited to degeneration associated
with Parkinson's disease, Alzheimer's disease, Huntington's chorea, or amyotrophic lateral
sclerosis;

(v) lesions associated with nutritional diseases or disorders, in which a portion of the
nervous system is destroyed or injured by a nutritional disorder or disorder of metabolism
including but not limited to, vitamin B12 deficiency, folic acid deficiency, Wernicke disease,
tobacco-alcohol amblyopia, Marchiafava-Bignami disease (primary degeneration of the corpus
callosum), and alcoholic cerebellar degeneration;

(vi) neurological lesions associated with systemic diseases including but not limited to
diabetes (diabetic neuropathy, Bell's palsy), systemic lupus erythematosus, carcinoma, or
sarcoidosis;

(vii) lesions caused by toxic substances including alcohol, lead, or particular
neurotoxins; and

(viii) demyelinated lesions in which a portion of the nervous system is destroyed or
injured by a demyelinating disease including but not limited to multiple sclerosis, human
immunodeficiency virus-associated myelopathy, transverse myelopathy of various etiologies,
progressive multifocal leukoencephalopathy, and central pontine myelinolysis.

Therapeutics which are useful according to the invention for treatment of a nervous
system disorder may be selected by testing for biological activity in promoting the survival or
differentiation of neurons. For example, and not by way of limitation, therapeutics which elicit
any of the following effects may be useful according to the invention:

(i) increased survival time of neurons in culture;

(ii) increased sprouting of neurons in culture or in vivo;

(iii) increased production of a neuron-associated molecule in culture or in vivo, e.g.,
choline acetyltransferase or acetylcholinesterase with respect to motor neurons; or

(iv) decreased symptoms of neuron dysfunction in vivo.

Such effects may be measured by any method known in the art. In preferred,
non-limiting embodiments, increased survival of neurons may be measured by the method set
forth in Arakawa et al. (1990, J. Neurosci. 10:3507-3515); increased sprouting of neurons may
be detected by methods set forth in Pestronk et al. (1980, Exp. Neurol. 70:65-82) or Brown et al.
(1981, Ann. Rev. Neurosci. 4:17-42); increased production of neuron-associated molecules may
be measured by bioassay, enzymatic assay, antibody binding, Northern blot assay, etc.,
depending on the molecule to be measured; and motor neuron dysfunction may be measured by
assessing the physical manifestation of motor neuron disorder, e.g., weakness, motor neuron
conduction velocity, or functional disability.

In specific embodiments, motor neuron disorders that may be treated according to the
invention include but are not limited to disorders such as infarction, infection, exposure to toxin,
trauma, surgical damage, degenerative disease or malignancy that may affect motor neurons as
well as other components of the nervous system, as well as disorders that selectively affect
neurons such as amyotrophic lateral sclerosis, and including but not limited to progressive spinal
muscular atrophy, progressive bulbar palsy, primary lateral sclerosis, infantile and juvenile
muscular atrophy, progressive bulbar paralysis of childhood (Fazio-Londe syndrome),
poliomyelitis and the post polio syndrome, and Hereditary Motorsensory Neuropathy
(Charcot-Marie-Tooth Disease).

4.10.18 OTHER ACTIVITIES

A polypeptide of the invention may also exhibit one or more of the following additional
activities or effects: inhibiting the growth, infection or function of, or killing, infectious agents,
including, without limitation, bacteria, viruses, fungi and other parasites; effecting (suppressing
or enhancing) bodily characteristics, including, without limitation, height, weight, hair color, eye
color, skin, fat to lean ratio or other tissue pigmentation, or organ or body part size or shape
(such as, for example, breast augmentation or diminution, change in bone form or shape);
effecting biorhythms or circadian cycles or rhythms; effecting the fertility of male or female
subjects; effecting the metabolism, catabolism, anabolism, processing, utilization, storage or
elimination of dietary fat, lipid, protein, carbohydrate, vitamins, minerals, co-factors or other
nutritional factors or component(s); effecting behavioral characteristics, including, without
limitation, appetite, libido, stress, cognition (including cognitive disorders), depression
(including depressive disorders) and violent behaviors; providing analgesic effects or other pain
reducing effects; promoting differentiation and growth of embryonic stem cells in lineages other
than hematopoietic lineages; hormonal or endocrine activity; in the case of enzymes, correcting
deficiencies of the enzyme and treating deficiency-related diseases; treatment of
hyperproliferative disorders (such as, for example, psoriasis); immunoglobulin-like activity (such
as, for example, the ability to bind antigens or complement); and the ability to act as an antigen
in a vaccine composition to raise an immune response against such protein or another material or
entity which is cross-reactive with such protein.

4.10.19 IDENTIFICATION OF POLYMORPHISMS

The demonstration of polymorphisms makes possible the identification of such
polymorphisms in human subjects and the pharmacogenetic use of this information for diagnosis
and treatment. Such polymorphisms may be associated with, e.g., differential predisposition or
susceptibility to various disease states (such as disorders involving inflammation or immune
response) or a differential response to drug administration, and this genetic information can be
used to tailor preventive or therapeutic treatment appropriately. For example, the existence of a
polymorphism associated with a predisposition to inflammation or autoimmune disease makes
possible the diagnosis of this condition in humans by identifying the presence of the
polymorphism.

Polymorphisms can be identified in a variety of ways known in the art which all
generally involve obtaining a sample from a patient, analyzing DNA from the sample, optionally
involving isolation or amplification of the DNA, and identifying the presence of the
polymorphism in the DNA. For example, PCR may be used to amplify an appropriate fragment
of genomic DNA which may then be sequenced. Alternatively, the DNA may be subjected to
allele-specific oligonucleotide hybridization (in which appropriate oligonucleotides are
hybridized to the DNA under conditions permitting detection of a single base mismatch) or to a
single nucleotide extension assay (in which an oligonucleotide that hybridizes immediately
adjacent to the position of the polymorphism is extended with one or more labeled nucleotides).
In addition, traditional restriction fragment length polymorphism analysis (using restriction
enzymes that provide differential digestion of the genomic DNA depending on the presence or
absence of the polymorphism) may be performed. Arrays with nucleotide sequences of the
present invention can be used to detect polymorphisms. The array can comprise modified
nucleotide sequences of the present invention in order to detect the nucleotide sequences of the
present invention. In the alternative, any one of the nucleotide sequences of the present
invention can be placed on the array to detect changes from those sequences.
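
The restriction fragment length polymorphism analysis described above can be illustrated with
a short in silico digest. The sketch below is an assumption-laden illustration rather than a
protocol from this specification: the two allele sequences are invented, the fragment is treated
as linear, and only the top strand is scanned. The enzyme used, EcoRI, recognizes GAATTC and
cuts after the first G.

```python
def fragment_lengths(sequence, site="GAATTC", cut_offset=1):
    """Return fragment lengths after cutting a linear DNA sequence at every
    occurrence of `site`. `cut_offset` is the cut position within the site
    (1 for EcoRI, which cuts G^AATTC).
    """
    lengths = []
    start = 0
    position = sequence.find(site)
    while position != -1:
        cut = position + cut_offset
        lengths.append(cut - start)
        start = cut
        position = sequence.find(site, position + 1)
    lengths.append(len(sequence) - start)  # remainder after the last cut
    return lengths


# Hypothetical alleles differing by one base (A -> G) inside the EcoRI site.
allele_1 = "ATGCCGTAGAATTCGGCTTAACG"
allele_2 = "ATGCCGTAGAGTTCGGCTTAACG"

print(fragment_lengths(allele_1))  # site present: two fragments
print(fragment_lengths(allele_2))  # site destroyed: one uncut fragment
```

The single-base change abolishes the recognition site, so the two alleles yield different
fragment patterns, which is the differential digestion that the analysis relies on.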

Alternatively, a polymorphism resulting in a change in the amino acid sequence could
also be detected by detecting a corresponding change in amino acid sequence of the protein, e.g.,
by an antibody specific to the variant sequence.

4.10.20 ARTHRITIS AND INFLAMMATION

The immunosuppressive effects of the compositions of the invention against rheumatoid
arthritis are determined in an experimental animal model system. The experimental model system
is adjuvant induced arthritis in rats, and the protocol is described by J. Holoshitz, et al., 1983,
Science, 219:56, or by B. Waksman et al., 1963, Int. Arch. Allergy Appl. Immunol., 23:129.
Induction of the disease can be caused by a single injection, generally intradermally, of a
suspension of killed Mycobacterium tuberculosis in complete Freund's adjuvant (CFA). The
route of injection can vary, but rats may be injected at the base of the tail with an adjuvant
mixture. The polypeptide is administered in phosphate buffered solution (PBS) at a dose of about
1-5 mg/kg. The control consists of administering PBS only.

The procedure for testing the effects of the test compound would consist of intradermally
injecting killed Mycobacterium tuberculosis in CFA followed by immediately administering the
test compound and subsequent treatment every other day until day 24. At 14, 15, 18, 20, 22, and
24 days after injection of Mycobacterium CFA, an overall arthritis score may be obtained as
described by J. Holoshitz above. An analysis of the data would reveal that the test compound
would have a dramatic effect on the swelling of the joints as measured by a decrease of the
arthritis score.
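
The dosing and scoring schedule described above can be written out explicitly. This is a
minimal sketch under stated assumptions: the injection day is taken as day 0, "immediately" is
read as treatment on day 0, and the score values are invented placeholders rather than
experimental results.

```python
# Treatment on the day of injection (assumed day 0), then every other day to day 24.
TREATMENT_DAYS = list(range(0, 25, 2))

# Days after injection on which the overall arthritis score is recorded.
SCORING_DAYS = [14, 15, 18, 20, 22, 24]


def mean_score(scores_by_day):
    """Average the arthritis score over the scheduled scoring days."""
    return sum(scores_by_day[day] for day in SCORING_DAYS) / len(SCORING_DAYS)


# Hypothetical scores for illustration only, not data from the specification.
pbs_control = {14: 8, 15: 9, 18: 10, 20: 10, 22: 9, 24: 8}
test_compound = {14: 4, 15: 4, 18: 5, 20: 5, 22: 3, 24: 3}

reduction = mean_score(pbs_control) - mean_score(test_compound)
print(TREATMENT_DAYS)
print(reduction)
```

A positive difference between the control and treated means corresponds to the decrease in
arthritis score that the procedure is designed to detect.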

4.11 THERAPEUTIC METHODS 

The compositions (including polypeptide fragments, analogs, variants and antibodies or
other binding partners or modulators including antisense polynucleotides) of the invention have
numerous applications in a variety of therapeutic methods. Examples of therapeutic applications
include, but are not limited to, those exemplified herein.

4.11.1 EXAMPLE 

One embodiment of the invention is the administration of an effective amount of the
polypeptides or other composition of the invention to individuals affected by a disease or
disorder that can be modulated by regulating the peptides of the invention. While the mode of
administration is not particularly important, parenteral administration is preferred. An
exemplary mode of administration is to deliver an intravenous bolus. The dosage of the
polypeptides or other composition of the invention will normally be determined by the
prescribing physician. It is to be expected that the dosage will vary according to the age, weight,
condition and response of the individual patient. Typically, the amount of polypeptide
administered per dose will be in the range of about 0.01 µg/kg to 100 mg/kg of body weight, with
the preferred dose being about 0.1 µg/kg to 10 mg/kg of patient body weight. For parenteral
administration, polypeptides of the invention will be formulated in an injectable form combined
with a pharmaceutically acceptable parenteral vehicle. Such vehicles are well known in the art
and examples include water, saline, Ringer's solution, dextrose solution, and solutions consisting
of small amounts of the human serum albumin. The vehicle may contain minor amounts of
additives that maintain the isotonicity and stability of the polypeptide or other active ingredient.
The preparation of such solutions is within the skill of the art.
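
Because the stated ranges mix micrograms and milligrams per kilogram, converting them to a
per-dose amount for a given body weight is a matter of unit arithmetic. The sketch below shows
only that conversion; the function name and the 70 kg body weight are hypothetical, and the
actual dosage is, as stated above, determined by the prescribing physician.

```python
def dose_range_mg(weight_kg, low_ug_per_kg, high_mg_per_kg):
    """Convert a per-kilogram range (lower bound in micrograms/kg, upper bound
    in milligrams/kg) into a (low, high) per-dose range in milligrams.
    """
    low_mg = low_ug_per_kg * weight_kg / 1000.0  # 1000 micrograms per milligram
    high_mg = high_mg_per_kg * weight_kg
    return low_mg, high_mg


# Hypothetical 70 kg body weight, used only to illustrate the arithmetic.
typical_range = dose_range_mg(70.0, 0.01, 100.0)   # about 0.0007 mg to 7000 mg
preferred_range = dose_range_mg(70.0, 0.1, 10.0)   # about 0.007 mg to 700 mg

print(typical_range)
print(preferred_range)
```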

4.12 PHARMACEUTICAL FORMULATIONS AND ROUTES OF
ADMINISTRATION 

A protein or other composition of the present invention (from whatever source derived,
including without limitation from recombinant and non-recombinant sources and including
antibodies and other binding partners of the polypeptides of the invention) may be administered
to a patient in need, by itself, or in pharmaceutical compositions where it is mixed with suitable
carriers or excipient(s) at doses to treat or ameliorate a variety of disorders. Such a composition
may optionally contain (in addition to protein or other active ingredient and a carrier) diluents,
fillers, salts, buffers, stabilizers, solubilizers, and other materials well known in the art. The term
"pharmaceutically acceptable" means a non-toxic material that does not interfere with the
effectiveness of the biological activity of the active ingredient(s). The characteristics of the
carrier will depend on the route of administration. The pharmaceutical composition of the
invention may also contain cytokines, lymphokines, or other hematopoietic factors such as
M-CSF, GM-CSF, TNF, IL-1, IL-2, IL-3, IL-4, IL-5, IL-6, IL-7, IL-8, IL-9, IL-10, IL-11, IL-12,
IL-13, IL-14, IL-15, IFN, TNF0, TNF1, TNF2, G-CSF, Meg-CSF, thrombopoietin, stem cell
factor, and erythropoietin. In further compositions, proteins of the invention may be combined
with other agents beneficial to the treatment of the disease or disorder in question. These agents
include various growth factors such as epidermal growth factor (EGF), platelet-derived growth
factor (PDGF), transforming growth factors (TGF-α and TGF-β), insulin-like growth factor
(IGF), as well as cytokines described herein.

The pharmaceutical composition may further contain other agents which either enhance
the activity of the protein or other active ingredient or complement its activity or use in
treatment. Such additional factors and/or agents may be included in the pharmaceutical
composition to produce a synergistic effect with protein or other active ingredient of the
invention, or to minimize side effects. Conversely, protein or other active ingredient of the
present invention may be included in formulations of the particular clotting factor, cytokine,
lymphokine, other hematopoietic factor, thrombolytic or anti-thrombotic factor, or
anti-inflammatory agent to minimize side effects of the clotting factor, cytokine, lymphokine, other
hematopoietic factor, thrombolytic or anti-thrombotic factor, or anti-inflammatory agent (such as
IL-1Ra, IL-1 Hy1, IL-1 Hy2, anti-TNF, corticosteroids, immunosuppressive agents). A protein
of the present invention may be active in multimers (e.g., heterodimers or homodimers) or
complexes with itself or other proteins. As a result, pharmaceutical compositions of the
invention may comprise a protein of the invention in such multimeric or complexed form.

As an alternative to being included in a pharmaceutical composition of the invention
including a first protein, a second protein or a therapeutic agent may be concurrently
administered with the first protein (e.g., at the same time, or at differing times provided that
therapeutic concentrations of the combination of agents are achieved at the treatment site).
Techniques for formulation and administration of the compounds of the instant application may
be found in "Remington's Pharmaceutical Sciences," Mack Publishing Co., Easton, PA, latest
edition. A therapeutically effective dose further refers to that amount of the compound sufficient
to result in amelioration of symptoms, e.g., treatment, healing, prevention or amelioration of the
relevant medical condition, or an increase in rate of treatment, healing, prevention or
amelioration of such conditions. When applied to an individual active ingredient, administered
alone, a therapeutically effective dose refers to that ingredient alone. When applied to a
combination, a therapeutically effective dose refers to combined amounts of the active
ingredients that result in the therapeutic effect, whether administered in combination, serially or
simultaneously.

In practicing the method of treatment or use of the present invention, a therapeutically
effective amount of protein or other active ingredient of the present invention is administered to
a mammal having a condition to be treated. Protein or other active ingredient of the present
invention may be administered in accordance with the method of the invention either alone or in
combination with other therapies such as treatments employing cytokines, lymphokines or other
hematopoietic factors. When co-administered with one or more cytokines, lymphokines or other
hematopoietic factors, protein or other active ingredient of the present invention may be
administered either simultaneously with the cytokine(s), lymphokine(s), other hematopoietic
factor(s), thrombolytic or anti-thrombotic factors, or sequentially. If administered sequentially,
the attending physician will decide on the appropriate sequence of administering protein or other
active ingredient of the present invention in combination with cytokine(s), lymphokine(s), other
hematopoietic factor(s), thrombolytic or anti-thrombotic factors.

4.12.1 ROUTES OF ADMINISTRATION 

Suitable routes of administration may, for example, include oral, rectal, transmucosal, or
intestinal administration; parenteral delivery, including intramuscular, subcutaneous,
intramedullary injections, as well as intrathecal, direct intraventricular, intravenous,
intraperitoneal, intranasal, or intraocular injections. Administration of protein or other active
ingredient of the present invention used in the pharmaceutical composition or to practice the
method of the present invention can be carried out in a variety of conventional ways, such as oral
ingestion, inhalation, topical application or cutaneous, subcutaneous, intraperitoneal, parenteral
or intravenous injection. Intravenous administration to the patient is preferred.

Alternately, one may administer the compound in a local rather than systemic manner, for
example, via injection of the compound directly into arthritic joints or fibrotic tissue, often in
a depot or sustained release formulation. In order to prevent the scarring process frequently
occurring as a complication of glaucoma surgery, the compounds may be administered topically,
for example, as eye drops. Furthermore, one may administer the drug in a targeted drug delivery
system, for example, in a liposome coated with a specific antibody, targeting, for example,
arthritic or fibrotic tissue. The liposomes will be targeted to and taken up selectively by the
afflicted tissue.

The polypeptides of the invention are administered by any route that delivers an effective
dosage to the desired site of action. The determination of a suitable route of administration and
an effective dosage for a particular indication is within the level of skill in the art. Preferably for
wound treatment, one administers the therapeutic compound directly to the site. Suitable dosage
ranges for the polypeptides of the invention can be extrapolated from these dosages or from
similar studies in appropriate animal models. Dosages can then be adjusted as necessary by the
clinician to provide maximal therapeutic benefit.

4.12.2 COMPOSITIONS/FORMULATIONS

Pharmaceutical compositions for use in accordance with the present invention thus may
be formulated in a conventional manner using one or more physiologically acceptable carriers
comprising excipients and auxiliaries which facilitate processing of the active compounds into
preparations which can be used pharmaceutically. These pharmaceutical compositions may be
manufactured in a manner that is itself known, e.g., by means of conventional mixing,
dissolving, granulating, dragee-making, levigating, emulsifying, encapsulating, entrapping or
lyophilizing processes. Proper formulation is dependent upon the route of administration chosen.
When a therapeutically effective amount of protein or other active ingredient of the present
invention is administered orally, protein or other active ingredient of the present invention will
be in the form of a tablet, capsule, powder, solution or elixir. When administered in tablet form,
the pharmaceutical composition of the invention may additionally contain a solid carrier such as
a gelatin or an adjuvant. The tablet, capsule, and powder contain from about 5 to 95% protein or
other active ingredient of the present invention, and preferably from about 25 to 90% protein or
other active ingredient of the present invention. When administered in liquid form, a liquid
carrier such as water, petroleum, oils of animal or plant origin such as peanut oil, mineral oil,
soybean oil, or sesame oil, or synthetic oils may be added. The liquid form of the
pharmaceutical composition may further contain physiological saline solution, dextrose or other
saccharide solution, or glycols such as ethylene glycol, propylene glycol or polyethylene glycol.
When administered in liquid form, the pharmaceutical composition contains from about 0.5 to
90% by weight of protein or other active ingredient of the present invention, and preferably from
about 1 to 50% protein or other active ingredient of the present invention.

When a therapeutically effective amount of protein or other active ingredient of the
present invention is administered by intravenous, cutaneous or subcutaneous injection, protein or
other active ingredient of the present invention will be in the form of a pyrogen-free, parenterally
acceptable aqueous solution. The preparation of such parenterally acceptable protein or other
active ingredient solutions, having due regard to pH, isotonicity, stability, and the like, is within
the skill in the art. A preferred pharmaceutical composition for intravenous, cutaneous, or
subcutaneous injection should contain, in addition to protein or other active ingredient of the
present invention, an isotonic vehicle such as Sodium Chloride Injection, Ringer's Injection,
Dextrose Injection, Dextrose and Sodium Chloride Injection, Lactated Ringer's Injection, or
other vehicle as known in the art. The pharmaceutical composition of the present invention may
also contain stabilizers, preservatives, buffers, antioxidants, or other additives known to those of
skill in the art. For injection, the agents of the invention may be formulated in aqueous solutions,
preferably in physiologically compatible buffers such as Hanks's solution, Ringer's solution, or
physiological saline buffer. For transmucosal administration, penetrants appropriate to the
barrier to be permeated are used in the formulation. Such penetrants are generally known in the
art.

For oral administration, the compounds can be formulated readily by combining the
active compounds with pharmaceutically acceptable carriers well known in the art. Such carriers
enable the compounds of the invention to be formulated as tablets, pills, dragees, capsules,
liquids, gels, syrups, slurries, suspensions and the like, for oral ingestion by a patient to be
treated. Pharmaceutical preparations for oral use can be obtained from a solid excipient,
optionally grinding a resulting mixture, and processing the mixture of granules, after adding
suitable auxiliaries, if desired, to obtain tablets or dragee cores. Suitable excipients are, in
particular, fillers such as sugars, including lactose, sucrose, mannitol, or sorbitol; cellulose
preparations such as, for example, maize starch, wheat starch, rice starch, potato starch, gelatin,
gum tragacanth, methyl cellulose, hydroxypropylmethyl-cellulose, sodium
carboxymethylcellulose, and/or polyvinylpyrrolidone (PVP). If desired, disintegrating agents
may be added, such as the cross-linked polyvinyl pyrrolidone, agar, or alginic acid or a salt
thereof such as sodium alginate. Dragee cores are provided with suitable coatings. For this
purpose, concentrated sugar solutions may be used, which may optionally contain gum arabic,
talc, polyvinyl pyrrolidone, carbopol gel, polyethylene glycol, and/or titanium dioxide, lacquer
solutions, and suitable organic solvents or solvent mixtures. Dyestuffs or pigments may be
added to the tablets or dragee coatings for identification or to characterize different combinations
of active compound doses.

Pharmaceutical preparations which can be used orally include push-fit capsules made of
gelatin, as well as soft, sealed capsules made of gelatin and a plasticizer, such as glycerol or
sorbitol. The push-fit capsules can contain the active ingredients in admixture with filler such as
lactose, binders such as starches, and/or lubricants such as talc or magnesium stearate and,
optionally, stabilizers. In soft capsules, the active compounds may be dissolved or suspended in
suitable liquids, such as fatty oils, liquid paraffin, or liquid polyethylene glycols. In addition,
stabilizers may be added. All formulations for oral administration should be in dosages suitable
for such administration. For buccal administration, the compositions may take the form of
tablets or lozenges formulated in conventional manner.

For administration by inhalation, the compounds for use according to the present
invention are conveniently delivered in the form of an aerosol spray presentation from
pressurized packs or a nebuliser, with the use of a suitable propellant, e.g.,
dichlorodifluoromethane, trichlorofluoromethane, dichlorotetrafluoroethane, carbon dioxide or
other suitable gas. In the case of a pressurized aerosol the dosage unit may be determined by
providing a valve to deliver a metered amount. Capsules and cartridges of, e.g., gelatin for use in
an inhaler or insufflator may be formulated containing a powder mix of the compound and a
suitable powder base such as lactose or starch. The compounds may be formulated for parenteral
administration by injection, e.g., by bolus injection or continuous infusion. Formulations for
injection may be presented in unit dosage form, e.g., in ampules or in multi-dose containers, with
an added preservative. The compositions may take such forms as suspensions, solutions or
emulsions in oily or aqueous vehicles, and may contain formulatory agents such as suspending,
stabilizing and/or dispersing agents.

Pharmaceutical formulations for parenteral administration include aqueous solutions of
the active compounds in water-soluble form. Additionally, suspensions of the active compounds
may be prepared as appropriate oily injection suspensions. Suitable lipophilic solvents or
vehicles include fatty oils such as sesame oil, or synthetic fatty acid esters, such as ethyl oleate or
triglycerides, or liposomes. Aqueous injection suspensions may contain substances which
increase the viscosity of the suspension, such as sodium carboxymethyl cellulose, sorbitol, or
dextran. Optionally, the suspension may also contain suitable stabilizers or agents which
increase the solubility of the compounds to allow for the preparation of highly concentrated
solutions. Alternatively, the active ingredient may be in powder form for constitution with a
suitable vehicle, e.g., sterile pyrogen-free water, before use.

The compounds may also be formulated in rectal compositions such as suppositories or
retention enemas, e.g., containing conventional suppository bases such as cocoa butter or other
glycerides. In addition to the formulations described previously, the compounds may also be
formulated as a depot preparation. Such long acting formulations may be administered by
implantation (for example subcutaneously or intramuscularly) or by intramuscular injection.
Thus, for example, the compounds may be formulated with suitable polymeric or hydrophobic
materials (for example as an emulsion in an acceptable oil) or ion exchange resins, or as
sparingly soluble derivatives, for example, as a sparingly soluble salt.

A pharmaceutical carrier for the hydrophobic compounds of the invention is a co-solvent
system comprising benzyl alcohol, a nonpolar surfactant, a water-miscible organic polymer, and
an aqueous phase. The co-solvent system may be the VPD co-solvent system. VPD is a solution
of 3% w/v benzyl alcohol, 8% w/v of the nonpolar surfactant polysorbate 80, and 65% w/v
polyethylene glycol 300, made up to volume in absolute ethanol. The VPD co-solvent system
(VPD:5W) consists of VPD diluted 1:1 with a 5% dextrose in water solution. This co-solvent
system dissolves hydrophobic compounds well, and itself produces low toxicity upon systemic
administration. Naturally, the proportions of a co-solvent system may be varied considerably
without destroying its solubility and toxicity characteristics. Furthermore, the identity of the
co-solvent components may be varied: for example, other low-toxicity nonpolar surfactants may
be used instead of polysorbate 80; the fraction size of polyethylene glycol may be varied; other
biocompatible polymers may replace polyethylene glycol, e.g. polyvinyl pyrrolidone; and other
sugars or polysaccharides may substitute for dextrose. Alternatively, other delivery systems for
hydrophobic pharmaceutical compounds may be employed. Liposomes and emulsions are well
known examples of delivery vehicles or carriers for hydrophobic drugs. Certain organic solvents
such as dimethylsulfoxide also may be employed, although usually at the cost of greater toxicity.
Additionally, the compounds may be delivered using a sustained-release system, such as
semipermeable matrices of solid hydrophobic polymers containing the therapeutic agent.
Various types of sustained-release materials have been established and are well known by those
skilled in the art. Sustained-release capsules may, depending on their chemical nature, release the
compounds for a few weeks up to over 100 days. Depending on the chemical nature and the
biological stability of the therapeutic reagent, additional strategies for protein or other active
ingredient stabilization may be employed.
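
As a rough illustration of the VPD arithmetic described above, the following Python sketch computes the w/v composition of VPD:5W after the 1:1 dilution and the mass of each component needed for a given volume of VPD stock. The function names and the 50 mL batch size are hypothetical, and the sketch assumes that volumes are simply additive on mixing, which is only an approximation for ethanol and water mixtures.

```python
# Illustrative sketch only; names and batch sizes are assumptions, not part of the disclosure.

# VPD stock composition as stated in the text, in % w/v (grams per 100 mL),
# made up to volume in absolute ethanol.
VPD_STOCK = {
    "benzyl alcohol": 3.0,
    "polysorbate 80": 8.0,
    "polyethylene glycol 300": 65.0,
}


def dilute(composition, stock_parts, diluent_parts):
    """Scale each % w/v value by the dilution factor (assumes additive volumes)."""
    factor = stock_parts / (stock_parts + diluent_parts)
    return {name: pct * factor for name, pct in composition.items()}


def grams_needed(composition, volume_ml):
    """Mass of each component (g) for volume_ml of a solution given in % w/v."""
    return {name: pct / 100.0 * volume_ml for name, pct in composition.items()}


# VPD:5W is VPD diluted 1:1 with a 5% dextrose in water solution.
vpd_5w = dilute(VPD_STOCK, stock_parts=1, diluent_parts=1)
vpd_5w["dextrose"] = 5.0 * 0.5  # the diluent's dextrose is halved by the same 1:1 mix

if __name__ == "__main__":
    for name, pct in vpd_5w.items():
        print(f"{name}: {pct:.1f}% w/v in VPD:5W")
    for name, grams in grams_needed(VPD_STOCK, volume_ml=50.0).items():
        print(f"{name}: {grams:.2f} g per 50 mL of VPD stock")
```

Under these assumptions, each VPD component ends up at half its stock concentration in VPD:5W, and the dextrose contributed by the diluent is likewise halved.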

The pharmaceutical compositions also may comprise suitable solid or gel phase carriers
or excipients. Examples of such carriers or excipients include but are not limited to calcium
carbonate, calcium phosphate, various sugars, starches, cellulose derivatives, gelatin, and
polymers such as polyethylene glycols. Many of the active ingredients of the invention may be
provided as salts with pharmaceutically compatible counter ions. Such pharmaceutically
acceptable base addition salts are those salts which retain the biological effectiveness and
properties of the free acids and which are obtained by reaction with inorganic or organic bases
such as sodium hydroxide, magnesium hydroxide, ammonia, trialkylamine, dialkylamine,
monoalkylamine, dibasic amino acids, sodium acetate, potassium benzoate, triethanol amine and
the like.

The pharmaceutical composition of the invention may be in the form of a complex of the
protein(s) or other active ingredient(s) of present invention along with protein or peptide
antigens. The protein and/or peptide antigen will deliver a stimulatory signal to both B and T
lymphocytes. B lymphocytes will respond to antigen through their surface immunoglobulin
receptor. T lymphocytes will respond to antigen through the T cell receptor (TCR) following
presentation of the antigen by MHC proteins. MHC and structurally related proteins including
those encoded by class I and class II MHC genes on host cells will serve to present the peptide
antigen(s) to T lymphocytes. The antigen components could also be supplied as purified
MHC-peptide complexes alone or with co-stimulatory molecules that can directly signal T cells.
Alternatively antibodies able to bind surface immunoglobulin and other molecules on B cells
as well as antibodies able to bind the TCR and other molecules on T cells can be combined with
the pharmaceutical composition of the invention.

The pharmaceutical composition of the invention may be in the form of a liposome in
which protein of the present invention is combined, in addition to other pharmaceutically
acceptable carriers, with amphipathic agents such as lipids which exist in aggregated form as
micelles, insoluble monolayers, liquid crystals, or lamellar layers in aqueous solution. Suitable
lipids for liposomal formulation include, without limitation, monoglycerides, diglycerides,
sulfatides, lysolecithins, phospholipids, saponin, bile acids, and the like. Preparation of such
liposomal formulations is within the level of skill in the art, as disclosed, for example, in U.S.
Patent Nos. 4,235,871; 4,501,728; 4,837,028; and 4,737,323, all of which are incorporated
herein by reference.

The amount of protein or other active ingredient of the present invention in the
pharmaceutical composition of the present invention will depend upon the nature and severity of
the condition being treated, and on the nature of prior treatments which the patient has
undergone. Ultimately, the attending physician will decide the amount of protein or other active
ingredient of the present invention with which to treat each individual patient. Initially, the
attending physician will administer low doses of protein or other active ingredient of the present
invention and observe the patient's response. Larger doses of protein or other active ingredient
of the present invention may be administered until the optimal therapeutic effect is obtained for
the patient, and at that point the dosage is not increased further. It is contemplated that the
various pharmaceutical compositions used to practice the method of the present invention should
contain about 0.01 μg to about 100 mg (preferably about 0.1 μg to about 10 mg, more preferably
about 0.1 μg to about 1 mg) of protein or other active ingredient of the present invention per kg
body weight. For compositions of the present invention which are useful for bone, cartilage,
tendon or ligament regeneration, the therapeutic method includes administering the composition
topically, systemically, or locally as an implant or device. When administered, the therapeutic
composition for use in this invention is, of course, in a pyrogen-free, physiologically acceptable
form. Further, the composition may desirably be encapsulated or injected in a viscous form for
delivery to the site of bone, cartilage or tissue damage. Topical administration may be suitable
for wound healing and tissue repair. Therapeutically useful agents other than a protein or other
active ingredient of the invention which may also optionally be included in the composition as
described above, may alternatively or additionally be administered simultaneously or
sequentially with the composition in the methods of the invention. Preferably for bone and/or
cartilage formation, the composition would include a matrix capable of delivering the
protein-containing or other active ingredient-containing composition to the site of bone and/or
cartilage damage, providing a structure for the developing bone and cartilage and optimally
capable of being resorbed into the body. Such matrices may be formed of materials presently in
use for other implanted medical applications.

The choice of matrix material is based on biocompatibility, biodegradability, mechanical
properties, cosmetic appearance and interface properties. The particular application of the
compositions will define the appropriate formulation. Potential matrices for the compositions
may be biodegradable and chemically defined calcium sulfate, tricalcium phosphate,
hydroxyapatite, polylactic acid, polyglycolic acid and polyanhydrides. Other potential materials
are biodegradable and biologically well-defined, such as bone or dermal collagen. Further
matrices are comprised of pure proteins or extracellular matrix components. Other potential
matrices are nonbiodegradable and chemically defined, such as sintered hydroxyapatite, bioglass,
aluminates, or other ceramics. Matrices may be comprised of combinations of any of the above
mentioned types of material, such as polylactic acid and hydroxyapatite or collagen and
tricalcium phosphate. The bioceramics may be altered in composition, such as in
calcium-aluminate-phosphate and processing to alter pore size, particle size, particle shape, and
biodegradability. Presently preferred is a 50:50 (mole weight) copolymer of lactic acid and
glycolic acid in the form of porous particles having diameters ranging from 150 to 800 microns.
In some applications, it will be useful to utilize a sequestering agent, such as carboxymethyl
cellulose or autologous blood clot, to prevent the protein compositions from disassociating from
the matrix.

A preferred family of sequestering agents is cellulosic materials such as alkylcelluloses
(including hydroxyalkylcelluloses), including methylcellulose, ethylcellulose,
hydroxyethylcellulose, hydroxypropylcellulose, hydroxypropyl-methylcellulose, and
carboxymethylcellulose, the most preferred being cationic salts of carboxymethylcellulose
(CMC). Other preferred sequestering agents include hyaluronic acid, sodium alginate,
poly(ethylene glycol), polyoxyethylene oxide, carboxyvinyl polymer and poly(vinyl alcohol).
The amount of sequestering agent useful herein is 0.5-20 wt %, preferably 1-10 wt % based on
total formulation weight, which represents the amount necessary to prevent desorption of the
protein from the polymer matrix and to provide appropriate handling of the composition, yet not
so much that the progenitor cells are prevented from infiltrating the matrix, thereby providing the
protein the opportunity to assist the osteogenic activity of the progenitor cells. In further
compositions, proteins or other active ingredients of the invention may be combined with other
agents beneficial to the treatment of the bone and/or cartilage defect, wound, or tissue in
question. These agents include various growth factors such as epidermal growth factor (EGF),
platelet derived growth factor (PDGF), transforming growth factors (TGF-α and TGF-β), and
insulin-like growth factor (IGF).
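
The weight-percent ranges given above for the sequestering agent can be turned into a simple mass calculation. The Python sketch below is illustrative only: the function names and the 10 g formulation are hypothetical, and the range limits are taken directly from the stated 0.5-20 wt % and 1-10 wt % figures.

```python
# Illustrative sketch only; the function names and the 10 g batch are hypothetical.

BROAD_RANGE_WT_PCT = (0.5, 20.0)      # range stated in the text
PREFERRED_RANGE_WT_PCT = (1.0, 10.0)  # preferred range stated in the text


def sequestering_agent_mass(total_formulation_g, wt_percent):
    """Mass (g) of sequestering agent for a given wt % of total formulation weight."""
    low, high = BROAD_RANGE_WT_PCT
    if not low <= wt_percent <= high:
        raise ValueError(f"{wt_percent} wt % is outside the stated {low}-{high} wt % range")
    return total_formulation_g * wt_percent / 100.0


def in_preferred_range(wt_percent):
    """True if wt_percent falls within the preferred 1-10 wt % range."""
    low, high = PREFERRED_RANGE_WT_PCT
    return low <= wt_percent <= high


if __name__ == "__main__":
    mass = sequestering_agent_mass(total_formulation_g=10.0, wt_percent=5.0)
    print(f"{mass:.2f} g of sequestering agent in a 10 g formulation")
    print("within preferred range:", in_preferred_range(5.0))
```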

The therapeutic compositions are also presently valuable for veterinary applications.
Particularly domestic animals and thoroughbred horses, in addition to humans, are desired
patients for such treatment with proteins or other active ingredients of the present invention. The
dosage regimen of a protein-containing pharmaceutical composition to be used in tissue
regeneration will be determined by the attending physician considering various factors which
modify the action of the proteins, e.g., amount of tissue weight desired to be formed, the site of
damage, the condition of the damaged tissue, the size of a wound, type of damaged tissue (e.g.,
bone), the patient's age, sex, and diet, the severity of any infection, time of administration and
other clinical factors. The dosage may vary with the type of matrix used in the reconstitution and
with inclusion of other proteins in the pharmaceutical composition. For example, the addition of
other known growth factors, such as IGF I (insulin like growth factor I), to the final composition,
may also affect the dosage. Progress can be monitored by periodic assessment of tissue/bone
growth and/or repair, for example, X-rays, histomorphometric determinations and tetracycline
labeling.

Polynucleotides of the present invention can also be used for gene therapy. Such
polynucleotides can be introduced either in vivo or ex vivo into cells for expression in a
mammalian subject. Polynucleotides of the invention may also be administered by other known
methods for introduction of nucleic acid into a cell or organism (including, without limitation, in
the form of viral vectors or naked DNA). Cells may also be cultured ex vivo in the presence of
proteins of the present invention in order to proliferate or to produce a desired effect on or
activity in such cells. Treated cells can then be introduced in vivo for therapeutic purposes.

4.12.3 EFFECTIVE DOSAGE

Pharmaceutical compositions suitable for use in the present invention include
compositions wherein the active ingredients are contained in an effective amount to achieve their
intended purpose. More specifically, a therapeutically effective amount means an amount
effective to prevent development of or to alleviate the existing symptoms of the subject being
treated. Determination of the effective amount is well within the capability of those skilled in
the art, especially in light of the detailed disclosure provided herein. For any compound used in
the method of the invention, the therapeutically effective dose can be estimated initially from
appropriate in vitro assays. For example, a dose can be formulated in animal models to achieve a
circulating concentration range that includes the IC50 as determined in cell culture (i.e., the
concentration of the test compound which achieves a half-maximal inhibition of the protein's
biological activity). Such information can be used to more accurately determine useful doses in
humans.

A therapeutically effective dose refers to that amount of the compound that results in
amelioration of symptoms or a prolongation of survival in a patient. Toxicity and therapeutic
efficacy of such compounds can be determined by standard pharmaceutical procedures in cell
cultures or experimental animals, e.g., for determining the LD50 (the dose lethal to 50% of the
population) and the ED50 (the dose therapeutically effective in 50% of the population). The dose
ratio between toxic and therapeutic effects is the therapeutic index and it can be expressed as the
ratio between LD50 and ED50. Compounds which exhibit high therapeutic indices are preferred.
The data obtained from these cell culture assays and animal studies can be used in formulating a
range of dosage for use in humans. The dosage of such compounds lies preferably within a range
of circulating concentrations that include the ED50 with little or no toxicity. The dosage may
vary within this range depending upon the dosage form employed and the route of administration
utilized. The exact formulation, route of administration and dosage can be chosen by the
individual physician in view of the patient's condition. See, e.g., Fingl et al., 1975, in "The
Pharmacological Basis of Therapeutics", Ch. 1 p. 1. Dosage amount and interval may be adjusted
individually to provide plasma levels of the active moiety which are sufficient to maintain the
desired effects, or minimal effective concentration (MEC). The MEC will vary for each
compound but can be estimated from in vitro data. Dosages necessary to achieve the MEC will
depend on individual characteristics and route of administration. However, HPLC assays or
bioassays can be used to determine plasma concentrations.
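
The therapeutic index defined above is a simple ratio of LD50 to ED50, sketched here in Python. The function name is hypothetical and the LD50 and ED50 values are invented purely for illustration; they do not describe any compound of the invention.

```python
# Illustrative sketch only; the numeric values below are made up for the example.


def therapeutic_index(ld50, ed50):
    """Ratio of LD50 to ED50; both must be positive and expressed in the same units."""
    if ld50 <= 0 or ed50 <= 0:
        raise ValueError("LD50 and ED50 must be positive")
    return ld50 / ed50


if __name__ == "__main__":
    # Hypothetical compound: LD50 = 200 mg/kg, ED50 = 10 mg/kg.
    print(therapeutic_index(ld50=200.0, ed50=10.0))
```

A larger ratio means a wider margin between the therapeutically effective dose and the lethal dose, which is why compounds with high indices are preferred.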

Dosage intervals can also be determined using the MEC value. Compounds should be
administered using a regimen which maintains plasma levels above the MEC for 10-90% of the
time, preferably between 30-90% and most preferably between 50-90%. In cases of local
administration or selective uptake, the effective local concentration of the drug may not be
related to plasma concentration.
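
To make the time-above-MEC criterion concrete, the Python sketch below estimates the fraction of a dosing interval during which plasma concentration exceeds the MEC. It assumes a one-compartment model with first-order elimination after a single intravenous bolus and no accumulation between doses; that model, the function names, and the example parameters are assumptions introduced for illustration and are not taken from the text.

```python
import math

# Illustrative sketch only. Assumes C(t) = C0 * exp(-k * t) after a single IV bolus,
# which is a simplifying model not stated in the source text.


def fraction_above_mec(c0, mec, half_life_h, interval_h):
    """Fraction of the dosing interval during which concentration exceeds the MEC."""
    if min(c0, mec, half_life_h, interval_h) <= 0:
        raise ValueError("all parameters must be positive")
    if c0 <= mec:
        return 0.0
    k = math.log(2) / half_life_h          # first-order elimination rate constant
    time_above_h = math.log(c0 / mec) / k  # time until C(t) falls to the MEC
    return min(1.0, time_above_h / interval_h)


def coverage_band(fraction):
    """Classify a coverage fraction against the ranges stated in the text."""
    if 0.5 <= fraction <= 0.9:
        return "most preferred (50-90%)"
    if 0.3 <= fraction <= 0.9:
        return "preferred (30-90%)"
    if 0.1 <= fraction <= 0.9:
        return "within the stated 10-90% range"
    return "outside the stated 10-90% range"


if __name__ == "__main__":
    # Hypothetical regimen: peak 8 units, MEC 1 unit, 2 h half-life, dosing every 8 h.
    frac = fraction_above_mec(c0=8.0, mec=1.0, half_life_h=2.0, interval_h=8.0)
    print(f"{frac:.0%} of the interval above MEC: {coverage_band(frac)}")
```

In practice the concentrations would come from measured plasma levels (for example by HPLC assay or bioassay, as noted above) rather than from a model.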

An exemplary dosage regimen for polypeptides or other compositions of the invention
will be in the range of about 0.01 μg/kg to 100 mg/kg of body weight daily, with the preferred
dose being about 0.1 μg/kg to 25 mg/kg of patient body weight daily, varying in adults and
children. Dosing may be once daily, or equivalent doses may be delivered at longer or shorter
intervals.
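
The per-kilogram ranges above translate into absolute daily amounts once a body weight is fixed. The following Python sketch performs that conversion in micrograms; the function name and the 70 kg example weight are hypothetical, and the result is only the arithmetic span of the stated ranges, not a dosing recommendation.

```python
# Illustrative sketch only; names and the example body weight are assumptions.

UG_PER_MG = 1000.0

# Daily ranges stated in the text, converted to micrograms per kg of body weight.
EXEMPLARY_UG_PER_KG = (0.01, 100.0 * UG_PER_MG)
PREFERRED_UG_PER_KG = (0.1, 25.0 * UG_PER_MG)


def daily_dose_range_ug(body_weight_kg, per_kg_range=EXEMPLARY_UG_PER_KG):
    """Return (low, high) daily amounts in micrograms for the given body weight."""
    if body_weight_kg <= 0:
        raise ValueError("body weight must be positive")
    low, high = per_kg_range
    return low * body_weight_kg, high * body_weight_kg


if __name__ == "__main__":
    for label, rng in (("exemplary", EXEMPLARY_UG_PER_KG), ("preferred", PREFERRED_UG_PER_KG)):
        low, high = daily_dose_range_ug(70.0, rng)
        print(f"{label}: {low:.2f} ug to {high / UG_PER_MG:.0f} mg per day at 70 kg")
```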

The amount of composition administered will, of course, be dependent on the subject
being treated, on the subject's age and weight, the severity of the affliction, the manner of
administration and the judgment of the prescribing physician.

4.12.4 PACKAGING 

The compositions may, if desired, be presented in a pack or dispenser device which may
contain one or more unit dosage forms containing the active ingredient. The pack may, for
example, comprise metal or plastic foil, such as a blister pack. The pack or dispenser device may
be accompanied by instructions for administration. Compositions comprising a compound of the
invention formulated in a compatible pharmaceutical carrier may also be prepared, placed in an
appropriate container, and labeled for treatment of an indicated condition.

4.13 ANTIBODIES

Also included in the invention are antibodies to proteins, or fragments of proteins of the
invention. The term "antibody" as used herein refers to immunoglobulin molecules and
immunologically active portions of immunoglobulin (Ig) molecules, i.e., molecules that contain
an antigen binding site that specifically binds (immunoreacts with) an antigen. Such antibodies
include, but are not limited to, polyclonal, monoclonal, chimeric, single chain, Fab, Fab' and
F(ab')2 fragments, and an Fab expression library. In general, an antibody molecule obtained from
humans relates to any of the classes IgG, IgM, IgA, IgE and IgD, which differ from one another
by the nature of the heavy chain present in the molecule. Certain classes have subclasses as well,
such as IgG1, IgG2, and others. Furthermore, in humans, the light chain may be a kappa chain or
a lambda chain. Reference herein to antibodies includes a reference to all such classes,
subclasses and types of human antibody species.

An isolated protein of the invention, or a portion or fragment thereof, may serve as an
antigen and additionally can be used as an immunogen to generate antibodies that
immunospecifically bind the antigen, using standard techniques for polyclonal and monoclonal
antibody preparation. The full-length protein can be used or, alternatively, the invention
provides antigenic peptide fragments of the antigen for use as immunogens. An antigenic peptide
fragment comprises at least 6 amino acid residues of the amino acid sequence of the full length
protein, such as an amino acid sequence shown in SEQ ID NO:985, and encompasses an epitope
thereof such that an antibody raised against the peptide forms a specific immune complex with
the full length protein or with any fragment that contains the epitope. Preferably, the antigenic
peptide comprises at least 10 amino acid residues, or at least 15 amino acid residues, or at least
20 amino acid residues, or at least 30 amino acid residues. Preferred epitopes encompassed by
the antigenic peptide are regions of the protein that are located on its surface; commonly these
are hydrophilic regions.

In certain embodiments of the invention, at least one epitope encompassed by the
antigenic peptide is a region of the related protein that is located on the surface of the protein,
e.g., a hydrophilic region. A hydrophobicity analysis of the human related protein sequence will
indicate which regions of a related protein are particularly hydrophilic and, therefore, are likely
to encode surface residues useful for targeting antibody production. As a means for targeting
antibody production, hydropathy plots showing regions of hydrophilicity and hydrophobicity
may be generated by any method well known in the art, including, for example, the Kyte
Doolittle or the Hopp Woods methods, either with or without Fourier transformation. See, e.g.,
Hopp and Woods, 1981, Proc. Nat. Acad. Sci. USA 78: 3824-3828; Kyte and Doolittle 1982, J.
Mol. Biol. 157: 105-142, each of which is incorporated herein by reference in its entirety.
Antibodies that are specific for one or more domains within an antigenic protein, or derivatives,
fragments, analogs or homologs thereof, are also provided herein.
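
A hydropathy profile of the kind referred to above can be sketched as a sliding-window average
over per-residue hydropathy values. The example below uses the published Kyte-Doolittle scale
(positive values are hydrophobic) without Fourier transformation; the window length, function
names and the toy sequence are assumptions chosen for illustration, and the toy sequence is not
a sequence of the invention.

```python
# Kyte-Doolittle hydropathy values; positive = hydrophobic, negative = hydrophilic.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq, window=7):
    """Mean hydropathy of every window of `window` consecutive residues."""
    seq = seq.upper()
    if window < 1 or window > len(seq):
        raise ValueError("window must be between 1 and len(seq)")
    scores = [KD[aa] for aa in seq]  # KeyError for non-standard residues
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(seq) - window + 1)
    ]


def most_hydrophilic_window(seq, window=7):
    """Return (start index, peptide, mean score) of the lowest-scoring window."""
    profile = hydropathy_profile(seq, window)
    start = min(range(len(profile)), key=profile.__getitem__)
    return start, seq.upper()[start:start + window], profile[start]


if __name__ == "__main__":
    toy = "IIIIIKKKKKKIIIII"  # hypothetical sequence
    print(most_hydrophilic_window(toy, window=6))
```

A window of 6 matches the minimum antigenic peptide length stated above; longer windows smooth
the profile at the cost of resolution. Low-scoring windows are only candidates for surface
exposure and would still need experimental confirmation.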

A protein of the invention, or a derivative, fragment, analog, homolog or ortholog
thereof, may be utilized as an immunogen in the generation of antibodies that
immunospecifically bind these protein components.

Various procedures known within the art may be used for the production of polyclonal or
monoclonal antibodies directed against a protein of the invention, or against derivatives,
fragments, analogs, homologs or orthologs thereof (see, for example, Antibodies: A Laboratory
Manual, Harlow E, and Lane D, 1988, Cold Spring Harbor Laboratory Press, Cold Spring
Harbor, NY, incorporated herein by reference). Some of these antibodies are discussed below.

5.13.1 Polyclonal Antibodies 

For the production of polyclonal antibodies, various suitable host animals (e.g., rabbit,
goat, mouse or other mammal) may be immunized by one or more injections with the native
protein, a synthetic variant thereof, or a derivative of the foregoing. An appropriate
immunogenic preparation can contain, for example, the naturally occurring immunogenic
protein, a chemically synthesized polypeptide representing the immunogenic protein, or a
recombinantly expressed immunogenic protein. Furthermore, the protein may be conjugated to
a second protein known to be immunogenic in the mammal being immunized. Examples of such
immunogenic proteins include but are not limited to keyhole limpet hemocyanin, serum albumin,
bovine thyroglobulin, and soybean trypsin inhibitor. The preparation can further include an
adjuvant. Various adjuvants used to increase the immunological response include, but are not
limited to, Freund's (complete and incomplete), mineral gels (e.g., aluminum hydroxide), surface
active substances (e.g., lysolecithin, pluronic polyols, polyanions, peptides, oil emulsions,
dinitrophenol, etc.), adjuvants usable in humans such as Bacille Calmette-Guerin and
Corynebacterium parvum, or similar immunostimulatory agents. Additional examples of
adjuvants which can be employed include MPL-TDM adjuvant (monophosphoryl Lipid A,
synthetic trehalose dicorynomycolate).

The polyclonal antibody molecules directed against the immunogenic protein can be
isolated from the mammal (e.g., from the blood) and further purified by well known techniques,
such as affinity chromatography using protein A or protein G, which provide primarily the IgG
fraction of immune serum. Subsequently, or alternatively, the specific antigen which is the
target of the immunoglobulin sought, or an epitope thereof, may be immobilized on a column to
purify the immune specific antibody by immunoaffinity chromatography. Purification of
immunoglobulins is discussed, for example, by D. Wilkinson (The Scientist, published by The
Scientist, Inc., Philadelphia PA, Vol. 14, No. 8 (April 17, 2000), pp. 25-28).

5.13.2 Monoclonal Antibodies 

The term "monoclonal antibody" (MAb) or "monoclonal antibody composition", as used
herein, refers to a population of antibody molecules that contain only one molecular species of
antibody molecule consisting of a unique light chain gene product and a unique heavy chain
gene product. In particular, the complementarity determining regions (CDRs) of the monoclonal
antibody are identical in all the molecules of the population. MAbs thus contain an antigen
binding site capable of immunoreacting with a particular epitope of the antigen characterized by
a unique binding affinity for it.

Monoclonal antibodies can be prepared using hybridoma methods, such as those
described by Kohler and Milstein, Nature, 256:495 (1975). In a hybridoma method, a mouse,
hamster, or other appropriate host animal, is typically immunized with an immunizing agent to
elicit lymphocytes that produce or are capable of producing antibodies that will specifically bind
to the immunizing agent. Alternatively, the lymphocytes can be immunized in vitro.

The immunizing agent will typically include the protein antigen, a fragment thereof or a fusion
protein thereof. Generally, either peripheral blood lymphocytes are used if cells of human origin
are desired, or spleen cells or lymph node cells are used if non-human mammalian sources are
desired. The lymphocytes are then fused with an immortalized cell line using a suitable fusing
agent, such as polyethylene glycol, to form a hybridoma cell (Goding, Monoclonal Antibodies:
Principles and Practice, Academic Press, (1986) pp. 59-103). Immortalized cell lines are usually
transformed mammalian cells, particularly myeloma cells of rodent, bovine and human origin.
Usually, rat or mouse myeloma cell lines are employed. The hybridoma cells can be cultured in
a suitable culture medium that preferably contains one or more substances that inhibit the growth
or survival of the unfused, immortalized cells. For example, if the parental cells lack the enzyme
hypoxanthine guanine phosphoribosyl transferase (HGPRT or HPRT), the culture medium for
the hybridomas typically will include hypoxanthine, aminopterin, and thymidine ("HAT
medium"), which substances prevent the growth of HGPRT-deficient cells.

Preferred immortalized cell lines are those that fuse efficiently, support stable high level
expression of antibody by the selected antibody-producing cells, and are sensitive to a medium
such as HAT medium. More preferred immortalized cell lines are murine myeloma lines, which
can be obtained, for instance, from the Salk Institute Cell Distribution Center, San Diego,
California and the American Type Culture Collection, Manassas, Virginia. Human myeloma and
mouse-human heteromyeloma cell lines also have been described for the production of human
monoclonal antibodies (Kozbor, J. Immunol., 133:3001 (1984); Brodeur et al., Monoclonal
Antibody Production Techniques and Applications, Marcel Dekker, Inc., New York, (1987) pp.
51-63).

The culture medium in which the hybridoma cells are cultured can then be assayed for
the presence of monoclonal antibodies directed against the antigen. Preferably, the binding
specificity of monoclonal antibodies produced by the hybridoma cells is determined by
immunoprecipitation or by an in vitro binding assay, such as radioimmunoassay (RIA) or
enzyme-linked immunosorbent assay (ELISA). Such techniques and assays are known in the
art. The binding affinity of the monoclonal antibody can, for example, be determined by the
Scatchard analysis of Munson and Pollard, Anal. Biochem., 107:220 (1980). Preferably,
antibodies having a high degree of specificity and a high binding affinity for the target antigen
are isolated.
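
To make the affinity determination concrete, the sketch below applies the classical Scatchard
linearization for a single class of non-interacting binding sites, B/F = (Bmax - B)/Kd, and
fits it by ordinary least squares. It is a minimal illustration rather than the procedure of
the cited reference; the function name and the synthetic data are assumptions.

```python
def scatchard_fit(bound, free):
    """Estimate (Kd, Bmax) from paired bound and free ligand concentrations.

    Fits the line B/F = Bmax/Kd - B/Kd, so slope = -1/Kd and
    intercept = Bmax/Kd. Kd is returned in the units of `free`,
    Bmax in the units of `bound`.
    """
    if len(bound) != len(free) or len(bound) < 2:
        raise ValueError("need at least two paired measurements")
    x = list(bound)
    y = [b / f for b, f in zip(bound, free)]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("bound values must not all be equal")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    if slope >= 0:
        raise ValueError("non-negative slope: data do not fit single-site binding")
    intercept = mean_y - slope * mean_x
    kd = -1.0 / slope
    bmax = intercept * kd
    return kd, bmax


if __name__ == "__main__":
    # Synthetic data generated from Kd = 2.0 and Bmax = 10.0 (arbitrary units).
    free = [0.5, 1.0, 2.0, 4.0, 8.0]
    bound = [10.0 * f / (2.0 + f) for f in free]
    print(scatchard_fit(bound, free))
```

A lower Kd corresponds to a higher binding affinity. Curved Scatchard plots indicate that the
single-site assumption does not hold, in which case a nonlinear fit is more appropriate.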

After the desired hybridoma cells are identified, the clones can be subcloned by limiting
dilution procedures and grown by standard methods. Suitable culture media for this purpose
include, for example, Dulbecco's Modified Eagle's Medium and RPMI-1640 medium.
Alternatively, the hybridoma cells can be grown in vivo as ascites in a mammal.

The monoclonal antibodies secreted by the subclones can be isolated or purified from the culture
medium or ascites fluid by conventional immunoglobulin purification procedures such as, for
example, protein A-Sepharose, hydroxylapatite chromatography, gel electrophoresis, dialysis, or
affinity chromatography.

The monoclonal antibodies can also be made by recombinant DNA methods, such as
those described in U.S. Patent No. 4,816,567. DNA encoding the monoclonal antibodies of the
invention can be readily isolated and sequenced using conventional procedures (e.g., by using
oligonucleotide probes that are capable of binding specifically to genes encoding the heavy and
light chains of murine antibodies). The hybridoma cells of the invention serve as a preferred
source of such DNA. Once isolated, the DNA can be placed into expression vectors, which are
then transfected into host cells such as simian COS cells, Chinese hamster ovary (CHO) cells, or
myeloma cells that do not otherwise produce immunoglobulin protein, to obtain the synthesis of
monoclonal antibodies in the recombinant host cells. The DNA also can be modified, for
example, by substituting the coding sequence for human heavy and light chain constant domains
in place of the homologous murine sequences (U.S. Patent No. 4,816,567; Morrison, Nature 368,
812-13 (1994)) or by covalently joining to the immunoglobulin coding sequence all or part of the
coding sequence for a non-immunoglobulin polypeptide. Such a non-immunoglobulin
polypeptide can be substituted for the constant domains of an antibody of the invention, or can
be substituted for the variable domains of one antigen-combining site of an antibody of the
invention to create a chimeric bivalent antibody.

5.13.2 Humanized Antibodies
The antibodies directed against the protein antigens of the invention can further comprise
humanized antibodies or human antibodies. These antibodies are suitable for administration to
humans without engendering an immune response by the human against the administered
immunoglobulin. Humanized forms of antibodies are chimeric immunoglobulins,
immunoglobulin chains or fragments thereof (such as Fv, Fab, Fab', F(ab')2 or other
antigen-binding subsequences of antibodies) that are principally comprised of the sequence of a
human immunoglobulin, and contain minimal sequence derived from a non-human immunoglobulin.
Humanization can be performed following the method of Winter and co-workers (Jones et al.,
Nature, 321:522-525 (1986); Riechmann et al., Nature, 332:323-327 (1988); Verhoeyen et al.,
Science, 239:1534-1536 (1988)), by substituting rodent CDRs or CDR sequences for the
corresponding sequences of a human antibody. (See also U.S. Patent No. 5,225,539.) In some
instances, Fv framework residues of the human immunoglobulin are replaced by corresponding
non-human residues. Humanized antibodies can also comprise residues which are found neither
in the recipient antibody nor in the imported CDR or framework sequences. In general, the
humanized antibody will comprise substantially all of at least one, and typically two, variable
domains, in which all or substantially all of the CDR regions correspond to those of a non-human
immunoglobulin and all or substantially all of the framework regions are those of a human
immunoglobulin consensus sequence. The humanized antibody optimally also will comprise at
least a portion of an immunoglobulin constant region (Fc), typically that of a human
immunoglobulin (Jones et al., 1986; Riechmann et al., 1988; and Presta, Curr. Op. Struct. Biol.,
2:593-596 (1992)).

5.13.3 Human Antibodies

Fully human antibodies relate to antibody molecules in which essentially the entire
sequences of both the light chain and the heavy chain, including the CDRs, arise from human
genes. Such antibodies are termed "human antibodies", or "fully human antibodies" herein.
Human monoclonal antibodies can be prepared by the trioma technique; the human B-cell
hybridoma technique (see Kozbor, et al., 1983 Immunol Today 4: 72) and the EBV hybridoma
technique to produce human monoclonal antibodies (see Cole, et al., 1985 In: Monoclonal
Antibodies and Cancer Therapy, Alan R. Liss, Inc., pp. 77-96). Human monoclonal
antibodies may be utilized in the practice of the present invention and may be produced by using
human hybridomas (see Cote, et al., 1983. Proc Natl Acad Sci USA 80: 2026-2030) or by
transforming human B-cells with Epstein Barr Virus in vitro (see Cole, et al., 1985 In:
Monoclonal Antibodies and Cancer Therapy, Alan R. Liss, Inc., pp. 77-96).

In addition, human antibodies can also be produced using additional techniques,
including phage display libraries (Hoogenboom and Winter, J. Mol. Biol., 227:381 (1991);
Marks et al., J. Mol. Biol., 222:581 (1991)). Similarly, human antibodies can be made by
introducing human immunoglobulin loci into transgenic animals, e.g., mice in which the
endogenous immunoglobulin genes have been partially or completely inactivated. Upon
challenge, human antibody production is observed, which closely resembles that seen in humans
in all respects, including gene rearrangement, assembly, and antibody repertoire. This approach
is described, for example, in U.S. Patent Nos. 5,545,807; 5,545,806; 5,569,825; 5,625,126;
5,633,425; 5,661,016, and in Marks et al. (Bio/Technology 10, 779-783 (1992)); Lonberg et al.
(Nature 368, 856-859 (1994)); Morrison (Nature 368, 812-13 (1994)); Fishwild et al. (Nature
Biotechnology 14, 845-51 (1996)); Neuberger (Nature Biotechnology 14, 826 (1996)); and
Lonberg and Huszar (Intern. Rev. Immunol. 13 65-93 (1995)).

Human antibodies may additionally be produced using transgenic nonhuman animals
which are modified so as to produce fully human antibodies rather than the animal's endogenous
antibodies in response to challenge by an antigen. (See PCT publication WO94/02602). The
endogenous genes encoding the heavy and light immunoglobulin chains in the nonhuman host
have been incapacitated, and active loci encoding human heavy and light chain immunoglobulins
are inserted into the host's genome. The human genes are incorporated, for example, using yeast
artificial chromosomes containing the requisite human DNA segments. An animal which
provides all the desired modifications is then obtained as progeny by crossbreeding intermediate
transgenic animals containing fewer than the full complement of the modifications. The
preferred embodiment of such a nonhuman animal is a mouse, and is termed the Xenomouse™
as disclosed in PCT publications WO 96/33735 and WO 96/34096. This animal produces B cells
which secrete fully human immunoglobulins. The antibodies can be obtained directly from the
animal after immunization with an immunogen of interest, as, for example, a preparation of a
polyclonal antibody, or alternatively from immortalized B cells derived from the animal, such as
hybridomas producing monoclonal antibodies. Additionally, the genes encoding the
immunoglobulins with human variable regions can be recovered and expressed to obtain the
antibodies directly, or can be further modified to obtain analogs of antibodies such as, for
example, single chain Fv molecules.

An example of a method of producing a nonhuman host, exemplified as a mouse, lacking
expression of an endogenous immunoglobulin heavy chain is disclosed in U.S. Patent No.
5,939,598. It can be obtained by a method including deleting the J segment genes from at least
one endogenous heavy chain locus in an embryonic stem cell to prevent rearrangement of the
locus and to prevent formation of a transcript of a rearranged immunoglobulin heavy chain locus,
the deletion being effected by a targeting vector containing a gene encoding a selectable marker,
and producing from the embryonic stem cell a transgenic mouse whose somatic and germ cells
contain the gene encoding the selectable marker.

A method for producing an antibody of interest, such as a human antibody, is disclosed in
U.S. Patent No. 5,916,771. It includes introducing an expression vector that contains a
nucleotide sequence encoding a heavy chain into one mammalian host cell in culture, introducing
an expression vector containing a nucleotide sequence encoding a light chain into another
mammalian host cell, and fusing the two cells to form a hybrid cell. The hybrid cell expresses an
antibody containing the heavy chain and the light chain.

In a further improvement on this procedure, a method for identifying a clinically relevant
epitope on an immunogen, and a correlative method for selecting an antibody that binds
immunospecifically to the relevant epitope with high affinity, are disclosed in PCT publication
WO 99/53049.

5.13.4 Fab Fragments and Single Chain Antibodies 

According to the invention, techniques can be adapted for the production of single-chain
antibodies specific to an antigenic protein of the invention (see e.g., U.S. Patent No. 4,946,778).
In addition, methods can be adapted for the construction of Fab expression libraries (see e.g.,
Huse, et al., 1989 Science 246: 1275-1281) to allow rapid and effective identification of
monoclonal Fab fragments with the desired specificity for a protein or derivatives, fragments,
analogs or homologs thereof. Antibody fragments that contain the idiotypes to a protein antigen
may be produced by techniques known in the art including, but not limited to: (i) an F(ab')2
fragment produced by pepsin digestion of an antibody molecule; (ii) an Fab fragment generated
by reducing the disulfide bridges of an F(ab')2 fragment; (iii) an Fab fragment generated by the
treatment of the antibody molecule with papain and a reducing agent and (iv) Fv fragments.

5.13.5 Bispecific Antibodies 

Bispecific antibodies are monoclonal, preferably human or humanized, antibodies that
have binding specificities for at least two different antigens. In the present case, one of the
binding specificities is for an antigenic protein of the invention. The second binding target is any
other antigen, and advantageously is a cell-surface protein or receptor or receptor subunit.

Methods for making bispecific antibodies are known in the art. Traditionally, the
recombinant production of bispecific antibodies is based on the co-expression of two
immunoglobulin heavy-chain/light-chain pairs, where the two heavy chains have different
specificities (Milstein and Cuello, Nature, 305:537-539 (1983)). Because of the random
assortment of immunoglobulin heavy and light chains, these hybridomas (quadromas) produce a
potential mixture of ten different antibody molecules, of which only one has the correct
bispecific structure. The purification of the correct molecule is usually accomplished by affinity
chromatography steps. Similar procedures are disclosed in WO 93/08829, published 13 May
1993, and in Traunecker et al., 1991 EMBO J., 10:3655-3659.
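
The figure of ten molecules follows from simple counting: each antibody arm pairs one of two
heavy chains with one of two light chains (four possible arms), and an antibody is an unordered
pair of arms. The sketch below enumerates the combinations; the chain labels H1, H2, L1 and L2
are hypothetical, with H1/L1 and H2/L2 taken as the cognate pairs.

```python
from itertools import combinations_with_replacement, product

HEAVY = ("H1", "H2")  # hypothetical labels for the two heavy chains
LIGHT = ("L1", "L2")  # hypothetical labels for the two light chains

# The single desired product: one cognate arm of each specificity.
CORRECT = {("H1", "L1"), ("H2", "L2")}


def quadroma_species():
    """All distinct antibodies formed by random heavy/light assortment."""
    arms = list(product(HEAVY, LIGHT))  # 4 possible half-molecules
    return list(combinations_with_replacement(arms, 2))


def is_correct_bispecific(molecule):
    """True only for the molecule carrying both cognate arms."""
    return set(molecule) == CORRECT


if __name__ == "__main__":
    species = quadroma_species()
    print(len(species), "distinct molecules")
    for molecule in species:
        tag = "  <- bispecific" if is_correct_bispecific(molecule) else ""
        print(molecule, tag)
```

The count is the number of unordered pairs, with repetition, drawn from four arms: 4 x 5 / 2 = 10.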

Antibody variable domains with the desired binding specificities (antibody-antigen
combining sites) can be fused to immunoglobulin constant domain sequences. The fusion
preferably is with an immunoglobulin heavy-chain constant domain, comprising at least part of
the hinge, CH2, and CH3 regions. It is preferred to have the first heavy-chain constant region
(CH1) containing the site necessary for light-chain binding present in at least one of the fusions.
DNAs encoding the immunoglobulin heavy-chain fusions and, if desired, the immunoglobulin
light chain, are inserted into separate expression vectors, and are co-transfected into a suitable
host organism. For further details of generating bispecific antibodies see, for example, Suresh et
al., Methods in Enzymology, 121:210 (1986).

According to another approach described in WO 96/27011, the interface between a pair
of antibody molecules can be engineered to maximize the percentage of heterodimers which are
recovered from recombinant cell culture. The preferred interface comprises at least a part of the
CH3 region of an antibody constant domain. In this method, one or more small amino acid side
chains from the interface of the first antibody molecule are replaced with larger side chains (e.g.
tyrosine or tryptophan). Compensatory "cavities" of identical or similar size to the large side
chain(s) are created on the interface of the second antibody molecule by replacing large amino
acid side chains with smaller ones (e.g. alanine or threonine). This provides a mechanism for
increasing the yield of the heterodimer over other unwanted end-products such as homodimers.

Bispecific antibodies can be prepared as full length antibodies or antibody fragments (e.g.
F(ab')2 bispecific antibodies). Techniques for generating bispecific antibodies from antibody
fragments have been described in the literature. For example, bispecific antibodies can be
prepared using chemical linkage. Brennan et al., Science 229:81 (1985) describe a procedure
wherein intact antibodies are proteolytically cleaved to generate F(ab')2 fragments. These
fragments are reduced in the presence of the dithiol complexing agent sodium arsenite to
stabilize vicinal dithiols and prevent intermolecular disulfide formation. The Fab' fragments
generated are then converted to thionitrobenzoate (TNB) derivatives. One of the Fab'-TNB
derivatives is then reconverted to the Fab'-thiol by reduction with mercaptoethylamine and is
mixed with an equimolar amount of the other Fab'-TNB derivative to form the bispecific
antibody. The bispecific antibodies produced can be used as agents for the selective
immobilization of enzymes.

Additionally, Fab' fragments can be directly recovered from E. coli and chemically
coupled to form bispecific antibodies. Shalaby et al., J. Exp. Med. 175:217-225 (1992) describe
the production of a fully humanized bispecific antibody F(ab')2 molecule. Each Fab' fragment
was separately secreted from E. coli and subjected to directed chemical coupling in vitro to form
the bispecific antibody. The bispecific antibody thus formed was able to bind to cells
overexpressing the ErbB2 receptor and normal human T cells, as well as trigger the lytic activity
of human cytotoxic lymphocytes against human breast tumor targets.

Various techniques for making and isolating bispecific antibody fragments directly from
recombinant cell culture have also been described. For example, bispecific antibodies have been
produced using leucine zippers. Kostelny et al., J. Immunol. 148(5):1547-1553 (1992). The
leucine zipper peptides from the Fos and Jun proteins were linked to the Fab' portions of two
different antibodies by gene fusion. The antibody homodimers were reduced at the hinge region
to form monomers and then re-oxidized to form the antibody heterodimers. This method can
also be utilized for the production of antibody homodimers. The "diabody" technology
described by Hollinger et al., Proc. Natl. Acad. Sci. USA 90:6444-6448 (1993) has provided an
alternative mechanism for making bispecific antibody fragments. The fragments comprise a
heavy-chain variable domain (VH) connected to a light-chain variable domain (VL) by a linker
which is too short to allow pairing between the two domains on the same chain. Accordingly,
the VH and VL domains of one fragment are forced to pair with the complementary VL and VH
domains of another fragment, thereby forming two antigen-binding sites. Another strategy for
making bispecific antibody fragments by the use of single-chain Fv (sFv) dimers has also been
reported. See, Gruber et al., J. Immunol. 152:5368 (1994).

Antibodies with more than two valencies are contemplated. For example, trispecific
antibodies can be prepared. Tutt et al., J. Immunol. 147:60 (1991).

Exemplary bispecific antibodies can bind to two different epitopes, at least one of which
originates in the protein antigen of the invention. Alternatively, an anti-antigenic arm of an
immunoglobulin molecule can be combined with an arm which binds to a triggering molecule on
a leukocyte such as a T-cell receptor molecule (e.g. CD2, CD3, CD28, or B7), or Fc receptors for
IgG (FcγR), such as FcγRI (CD64), FcγRII (CD32) and FcγRIII (CD16) so as to focus cellular
defense mechanisms to the cell expressing the particular antigen. Bispecific antibodies can also
be used to direct cytotoxic agents to cells which express a particular antigen. These antibodies
possess an antigen-binding arm and an arm which binds a cytotoxic agent or a radionuclide
chelator, such as EOTUBE, DTPA, DOTA, or TETA. Another bispecific antibody of interest
binds the protein antigen described herein and further binds tissue factor (TF).

5.13.6 Heteroconjugate Antibodies

Heteroconjugate antibodies are also within the scope of the present invention.
Heteroconjugate antibodies are composed of two covalently joined antibodies. Such antibodies
have, for example, been proposed to target immune system cells to unwanted cells (U.S. Patent
No. 4,676,980), and for treatment of HIV infection (WO 91/00360; WO 92/200373; EP 03089).
It is contemplated that the antibodies can be prepared in vitro using known methods in synthetic
protein chemistry, including those involving crosslinking agents. For example, immunotoxins
can be constructed using a disulfide exchange reaction or by forming a thioether bond.
Examples of suitable reagents for this purpose include iminothiolate and
methyl-4-mercaptobutyrimidate and those disclosed, for example, in U.S. Patent No. 4,676,980.

5.13.7 Effector Function Engineering

It can be desirable to modify the antibody of the invention with respect to effector function, so as
to enhance, e.g., the effectiveness of the antibody in treating cancer. For example, cysteine
residue(s) can be introduced into the Fc region, thereby allowing interchain disulfide bond
formation in this region. The homodimeric antibody thus generated can have improved
internalization capability and/or increased complement-mediated cell killing and
antibody-dependent cellular cytotoxicity (ADCC). See Caron et al., J. Exp Med., 176: 1191-1195
(1992) and Shopes, J. Immunol., 148: 2918-2922 (1992). Homodimeric antibodies with enhanced
anti-tumor activity can also be prepared using heterobifunctional cross-linkers as described in
Wolff et al. Cancer Research, 53: 2560-2565 (1993). Alternatively, an antibody can be engineered
that has dual Fc regions and can thereby have enhanced complement lysis and ADCC capabilities.
See Stevenson et al., Anti-Cancer Drug Design, 3: 219-230 (1989).

5.13.8 Immunoconjugates

The invention also pertains to immunoconjugates comprising an antibody conjugated to a
cytotoxic agent such as a chemotherapeutic agent, toxin (e.g., an enzymatically active toxin of
bacterial, fungal, plant, or animal origin, or fragments thereof), or a radioactive isotope (i.e., a
radioconjugate).

Chemotherapeutic agents useful in the generation of such immunoconjugates have been
described above. Enzymatically active toxins and fragments thereof that can be used include
diphtheria A chain, nonbinding active fragments of diphtheria toxin, exotoxin A chain (from
Pseudomonas aeruginosa), ricin A chain, abrin A chain, modeccin A chain, alpha-sarcin,
Aleurites fordii proteins, dianthin proteins, Phytolacca americana proteins (PAPI, PAPII, and
PAP-S), Momordica charantia inhibitor, curcin, crotin, Saponaria officinalis inhibitor, gelonin,
mitogellin, restrictocin, phenomycin, enomycin, and the trichothecenes. A variety of
radionuclides are available for the production of radioconjugated antibodies. Examples include

Conjugates of the antibody and cytotoxic agent are made using a variety of bifunctional
protein-coupling agents such as N-succinimidyl-3-(2-pyridyldithiol) propionate (SPDP),
iminothiolane (IT), bifunctional derivatives of imidoesters (such as dimethyl adipimidate HCL),
active esters (such as disuccinimidyl suberate), aldehydes (such as glutaraldehyde), bis-azido
compounds (such as bis (p-azidobenzoyl) hexanediamine), bis-diazonium derivatives (such as
bis-(p-diazoniumbenzoyl)-ethylenediamine), diisocyanates (such as tolyene 2,6-diisocyanate),
and bis-active fluorine compounds (such as 1,5-difluoro-2,4-dinitrobenzene). For example, a
ricin immunotoxin can be prepared as described in Vitetta et al., Science, 238: 1098 (1987).
Carbon-14-labeled 1-isothiocyanatobenzyl-3-methyldiethylene triaminepentaacetic acid
(MX-DTPA) is an exemplary chelating agent for conjugation of radionucleotide to the antibody.
See WO94/11026.

In another embodiment, the antibody can be conjugated to a "receptor" (such as
streptavidin) for utilization in tumor pretargeting wherein the antibody-receptor conjugate is
administered to the patient, followed by removal of unbound conjugate from the circulation
using a clearing agent and then administration of a "ligand" (e.g., avidin) that is in turn
conjugated to a cytotoxic agent.

4.14 COMPUTER READABLE SEQUENCES 

In one application of this embodiment, a nucleotide sequence of the present invention can
be recorded on computer readable media. As used herein, "computer readable media" refers to
any medium which can be read and accessed directly by a computer. Such media include, but
are not limited to: magnetic storage media, such as floppy discs, hard disc storage medium, and
magnetic tape; optical storage media such as CD-ROM; electrical storage media such as RAM
and ROM; and hybrids of these categories such as magnetic/optical storage media. A skilled
artisan can readily appreciate how any of the presently known computer readable mediums can
be used to create a manufacture comprising computer readable medium having recorded thereon
a nucleotide sequence of the present invention. As used herein, "recorded" refers to a process for
storing information on computer readable medium. A skilled artisan can readily adopt any of the
presently known methods for recording information on computer readable medium to generate
manufactures comprising the nucleotide sequence information of the present invention.

A variety of data storage structures are available to a skilled artisan for creating a
computer readable medium having recorded thereon a nucleotide sequence of the present
invention. The choice of the data storage structure will generally be based on the means chosen
to access the stored information. In addition, a variety of data processor programs and formats
can be used to store the nucleotide sequence information of the present invention on computer
readable medium. The sequence information can be represented in a word processing text file,
formatted in commercially-available software such as WordPerfect and Microsoft Word, or
represented in the form of an ASCII file, stored in a database application, such as DB2, Sybase,
Oracle, or the like. A skilled artisan can readily adapt any number of data processor structuring
formats (e.g. text file or database) in order to obtain computer readable medium having recorded
thereon the nucleotide sequence information of the present invention.
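
By way of illustration only, the two storage formats named above (an ASCII text file and a
database) can be sketched in a few lines of Python. The record identifier, the bases, the FASTA
layout, the table name and the use of SQLite in place of DB2, Sybase or Oracle are all assumptions
made for this sketch; none of them is taken from the disclosure.

```python
import sqlite3
import tempfile
from pathlib import Path

# Hypothetical record: the identifier and bases are placeholders, not a disclosed sequence.
RECORD_ID = "example_seq_1"
SEQUENCE = "ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG"


def write_fasta(path, record_id, sequence, width=60):
    """Write one sequence to a plain ASCII text file in FASTA layout."""
    lines = [f">{record_id}"]
    lines += [sequence[i:i + width] for i in range(0, len(sequence), width)]
    Path(path).write_text("\n".join(lines) + "\n", encoding="ascii")


def read_fasta(path):
    """Read back a single-record FASTA file as (record_id, sequence)."""
    lines = Path(path).read_text(encoding="ascii").splitlines()
    return lines[0][1:], "".join(lines[1:])


def store_in_database(conn, record_id, sequence):
    """Store one sequence in a relational table keyed by identifier."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sequences (id TEXT PRIMARY KEY, bases TEXT NOT NULL)"
    )
    conn.execute("INSERT OR REPLACE INTO sequences VALUES (?, ?)", (record_id, sequence))
    conn.commit()


def fetch_from_database(conn, record_id):
    """Return the stored bases for an identifier, or None if absent."""
    row = conn.execute("SELECT bases FROM sequences WHERE id = ?", (record_id,)).fetchone()
    return row[0] if row else None


def round_trip():
    """Record the sequence in both formats and read it back from each."""
    with tempfile.TemporaryDirectory() as tmp:
        fasta_path = Path(tmp) / "example.fasta"
        write_fasta(fasta_path, RECORD_ID, SEQUENCE)
        from_file = read_fasta(fasta_path)
    conn = sqlite3.connect(":memory:")
    store_in_database(conn, RECORD_ID, SEQUENCE)
    from_db = fetch_from_database(conn, RECORD_ID)
    conn.close()
    return from_file, from_db
```

Which structure is preferable depends, as stated above, on how the stored information is to be
accessed: a flat file suits sequential reading, while a keyed table suits lookup by identifier.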

By providing any of the nucleotide sequences SEQ ID NO: 1-984, 1969-2952, 3937-3942
or 3949-3954 or a representative fragment thereof; or a nucleotide sequence at least 95%
identical to any of the nucleotide sequences of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or
3949-3954 in computer readable form, a skilled artisan can routinely access the sequence
information for a variety of purposes. Computer software is publicly available which allows a
skilled artisan to access sequence information provided in a computer readable medium. The
examples which follow demonstrate how software which implements the BLAST (Altschul et
al., J. Mol. Biol. 215:403-410 (1990)) and BLAZE (Brutlag et al., Comp. Chem. 17:203-207
(1993)) search algorithms on a Sybase system is used to identify open reading frames (ORFs)
within a nucleic acid sequence. Such ORFs may be protein encoding fragments and may be
useful in producing commercially important proteins such as enzymes used in fermentation
reactions and in the production of commercially useful metabolites.
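
The idea of identifying ORFs within a nucleic acid sequence can be illustrated with a minimal
sketch. It is not the BLAST or BLAZE procedure referred to above; it is a plain scan of the six
reading frames for an ATG followed in frame by a stop codon. The function names, the minimum
length and the standard stop codons are assumptions of this sketch.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def orfs_in_strand(seq, min_codons):
    """Yield (start, end) of ATG...stop ORFs in the three frames of one strand.

    `end` is exclusive and includes the stop codon; `min_codons` counts the
    codons from ATG up to, but not including, the stop codon.
    """
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    yield start, i + 3
                start = None


def find_orfs(seq, min_codons=3):
    """Return (strand, start, end, orf_sequence) for both strands.

    Coordinates on the "-" strand refer to the reverse-complemented sequence.
    """
    seq = seq.upper()
    found = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for start, end in orfs_in_strand(s, min_codons):
            found.append((strand, start, end, s[start:end]))
    return found
```

A practical search would use a much larger minimum length and would typically follow the scan
with a similarity search of the translated ORF against known proteins.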

As used herein, "a computer-based system" refers to the hardware means, software
means, and data storage means used to analyze the nucleotide sequence information of the
present invention. The minimum hardware means of the computer-based systems of the present
invention comprises a central processing unit (CPU), input means, output means, and data
storage means. A skilled artisan can readily appreciate that any one of the currently available
computer-based systems is suitable for use in the present invention. As stated above, the
computer-based systems of the present invention comprise a data storage means having stored
therein a nucleotide sequence of the present invention and the necessary hardware means and
software means for supporting and implementing a search means. As used herein, "data storage
means" refers to memory which can store nucleotide sequence information of the present
invention, or a memory access means which can access manufactures having recorded thereon
the nucleotide sequence information of the present invention.

As used herein, "search means" refers to one or more programs which are implemented
on the computer-based system to compare a target sequence or target structural motif with the
sequence information stored within the data storage means. Search means are used to identify
fragments or regions of a known sequence which match a particular target sequence or target
motif. A variety of known algorithms are disclosed publicly and a variety of commercially
available software for conducting search means are and can be used in the computer-based
systems of the present invention. Examples of such software include, but are not limited to,
Smith-Waterman, MacPattern (EMBL), BLASTN and BLASTA (NPOLYPEPTIDEIA). A
skilled artisan can readily recognize that any one of the available algorithms or implementing
software packages for conducting homology searches can be adapted for use in the present
computer-based systems. As used herein, a "target sequence" can be any nucleic acid or amino
acid sequence of six or more nucleotides or two or more amino acids. A skilled artisan can
readily recognize that the longer a target sequence is, the less likely a target sequence will be
present as a random occurrence in the database. The most preferred sequence length of a target
sequence is from about 10 to 300 amino acids, more preferably from about 30 to 100 nucleotide
residues. However, it is well recognized that searches for commercially important fragments,
such as sequence fragments involved in gene expression and protein processing, may be of
shorter length.
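
The relationship between target length and chance occurrence can be made concrete with a short
calculation. The sketch below assumes a database of independent, equally likely residues, which
real sequence data only approximates; the function name and the example database sizes are
hypothetical.

```python
def expected_random_hits(target_length, database_length, alphabet_size=4):
    """Expected number of chance exact matches of a target in a random database.

    Assumes independent, equally likely residues: each of the
    (database_length - target_length + 1) start positions matches with
    probability alphabet_size ** -target_length.
    """
    if target_length < 1 or database_length < target_length:
        return 0.0
    positions = database_length - target_length + 1
    return positions * float(alphabet_size) ** (-target_length)
```

Under these assumptions, a six-nucleotide target is expected to occur by chance roughly 244,000
times in a database of 10^9 nucleotides, whereas a 30-nucleotide target is expected to occur far
less than once, which is the reason longer targets are generally preferred.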

As used herein, "a target structural motif," or "target motif," refers to any rationally
selected sequence or combination of sequences in which the sequence(s) are chosen based on a
three-dimensional configuration which is formed upon the folding of the target motif. There are
a variety of target motifs known in the art. Protein target motifs include, but are not limited to,
enzyme active sites and signal sequences. Nucleic acid target motifs include, but are not limited
to, promoter sequences, hairpin structures and inducible expression elements (protein binding
sequences).
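
As one hedged illustration of a search for a nucleic acid target motif, the sketch below looks
for candidate hairpin structures by finding a stem whose reverse complement recurs a short loop
distance downstream. Treating a perfect inverted repeat as a proxy for a hairpin, and the stem
and loop lengths used, are assumptions of this sketch rather than definitions from the text.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_hairpins(seq, stem=6, min_loop=3, max_loop=8):
    """Return (stem_start, loop_length, stem_sequence) for perfect inverted repeats.

    A hit means seq[stem_start:stem_start + stem] can base-pair with the
    segment that begins `loop_length` bases after the stem ends.
    """
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - 2 * stem - min_loop + 1):
        left = seq[i:i + stem]
        target = reverse_complement(left)
        for loop in range(min_loop, max_loop + 1):
            j = i + stem + loop
            if j + stem > len(seq):
                break
            if seq[j:j + stem] == target:
                hits.append((i, loop, left))
    return hits
```

Real motif searches usually tolerate mismatches and score thermodynamic stability; this version
reports exact matches only.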

4.15 TRIPLE HELIX FORMATION

In addition, the fragments of the present invention, as broadly described, can be used to
control gene expression through triple helix formation or antisense DNA or RNA, both of which
methods are based on the binding of a polynucleotide sequence to DNA or RNA.
Polynucleotides suitable for use in these methods are preferably 20 to 40 bases in length and are
designed to be complementary to a region of the gene involved in transcription (triple helix - see
Lee et al., Nucl. Acids Res. 6:3073 (1979); Cooney et al., Science 241:456 (1988); and Dervan
et al., Science 251:1360 (1991)) or to the mRNA itself (antisense - Okano, J. Neurochem. 56:560
(1991); Oligodeoxynucleotides as Antisense Inhibitors of Gene Expression, CRC Press, Boca
Raton, FL (1988)). Triple helix-formation optimally results in a shut-off of RNA transcription
from DNA, while antisense RNA hybridization blocks translation of an mRNA molecule into
polypeptide. Both techniques have been demonstrated to be effective in model systems.
Information contained in the sequences of the present invention is necessary for the design of an
antisense or triple helix oligonucleotide.
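
A minimal sketch of how sequence information feeds the design of an antisense oligonucleotide
follows. It simply returns the reverse complement of a chosen 20 to 40 base window of an mRNA;
the function name, the window choice and the example sequence are hypothetical, and no account
is taken of secondary structure, specificity or chemistry.

```python
DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def antisense_oligo(mrna, start, length=20):
    """Return a DNA oligo (written 5' to 3') complementary to mrna[start:start + length].

    The mRNA may be given with U or T; the 20 to 40 base limit follows the
    preferred length range stated in the text.
    """
    if not 20 <= length <= 40:
        raise ValueError("length outside the 20 to 40 base range given in the text")
    target = mrna.upper().replace("U", "T")[start:start + length]
    if len(target) < length:
        raise ValueError("target window runs past the end of the sequence")
    return target.translate(DNA_COMPLEMENT)[::-1]
```

For the placeholder mRNA AUGGCCAUUGUAAUGGGCCGCUGA, the 20-base window starting at position 0
gives the oligo CGGCCCATTACAATGGCCAT.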

4.16 DIAGNOSTIC ASSAYS AND KITS 

The present invention further provides methods to identify the presence or expression of
one of the ORFs of the present invention, or homolog thereof, in a test sample, using a nucleic
acid probe or antibodies of the present invention, optionally conjugated or otherwise associated
with a suitable label.

In general, methods for detecting a polynucleotide of the invention can comprise
contacting a sample with a compound that binds to and forms a complex with the polynucleotide
for a period sufficient to form the complex, and detecting the complex, so that if a complex is
detected, a polynucleotide of the invention is detected in the sample. Such methods can also
comprise contacting a sample under stringent hybridization conditions with nucleic acid primers
that anneal to a polynucleotide of the invention under such conditions, and amplifying annealed
polynucleotides, so that if a polynucleotide is amplified, a polynucleotide of the invention is
detected in the sample.

In general, methods for detecting a polypeptide of the invention can comprise contacting
a sample with a compound that binds to and forms a complex with the polypeptide for a period
sufficient to form the complex, and detecting the complex, so that if a complex is detected, a
polypeptide of the invention is detected in the sample.

In detail, such methods comprise incubating a test sample with one or more of the
antibodies or one or more of the nucleic acid probes of the present invention and assaying for
binding of the nucleic acid probes or antibodies to components within the test sample.

Conditions for incubating a nucleic acid probe or antibody with a test sample vary.
Incubation conditions depend on the format employed in the assay, the detection methods
employed, and the type and nature of the nucleic acid probe or antibody used in the assay. One
skilled in the art will recognize that any one of the commonly available hybridization,
amplification or immunological assay formats can readily be adapted to employ the nucleic acid
probes or antibodies of the present invention. Examples of such assays can be found in Chard,
T., An Introduction to Radioimmunoassay and Related Techniques, Elsevier Science Publishers,
Amsterdam, The Netherlands (1986); Bullock, G.R. et al., Techniques in Immunocytochemistry,
Academic Press, Orlando, FL Vol. 1 (1982), Vol. 2 (1983), Vol. 3 (1985); Tijssen, P., Practice
and Theory of Immunoassays: Laboratory Techniques in Biochemistry and Molecular Biology,
Elsevier Science Publishers, Amsterdam, The Netherlands (1985). The test samples of the
present invention include cells, protein or membrane extracts of cells, or biological fluids such as
sputum, blood, serum, plasma, or urine. The test sample used in the above-described method
will vary based on the assay format, nature of the detection method and the tissues, cells or
extracts used as the sample to be assayed. Methods for preparing protein extracts or membrane
extracts of cells are well known in the art and can readily be adapted in order to obtain a
sample which is compatible with the system utilized.

In another embodiment of the present invention, kits are provided which contain the
necessary reagents to carry out the assays of the present invention. Specifically, the invention
provides a compartment kit to receive, in close confinement, one or more containers which
comprises: (a) a first container comprising one of the probes or antibodies of the present
invention; and (b) one or more other containers comprising one or more of the following: wash
reagents, reagents capable of detecting presence of a bound probe or antibody.

In detail, a compartment kit includes any kit in which reagents are contained in separate
containers. Such containers include small glass containers, plastic containers or strips of plastic
or paper. Such containers allow one to efficiently transfer reagents from one compartment to
another compartment such that the samples and reagents are not cross-contaminated, and the
agents or solutions of each container can be added in a quantitative fashion from one
compartment to another. Such containers will include a container which will accept the test
sample, a container which contains the antibodies used in the assay, containers which contain
wash reagents (such as phosphate buffered saline, Tris-buffers, etc.), and containers which
contain the reagents used to detect the bound antibody or probe. Types of detection reagents
include labeled nucleic acid probes, labeled secondary antibodies, or in the alternative, if the
primary antibody is labeled, the enzymatic, or antibody binding reagents which are capable of
reacting with the labeled antibody. One skilled in the art will readily recognize that the disclosed
probes and antibodies of the present invention can be readily incorporated into one of the
established kit formats which are well known in the art.

4.17 MEDICAL IMAGING 

The novel polypeptides and binding partners of the invention are useful in medical
imaging of sites expressing the molecules of the invention (e.g., where the polypeptide of the
invention is involved in the immune response, for imaging sites of inflammation or infection).
See, e.g., Kunkel et al., U.S. Pat. No. 5,413,778. Such methods involve chemical attachment of
a labeling or imaging agent, administration of the labeled polypeptide to a subject in a
pharmaceutically acceptable carrier, and imaging the labeled polypeptide in vivo at the target
site.

4.18 SCREENING ASSAYS

Using the isolated proteins and polynucleotides of the invention, the present invention
further provides methods of obtaining and identifying agents which bind to a polypeptide
encoded by an ORF corresponding to any of the nucleotide sequences set forth in SEQ ID NO:
1-984, 1969-2952, 3937-3942 or 3949-3954, or bind to a specific domain of the polypeptide
encoded by the nucleic acid. In detail, said method comprises the steps of:

(a) contacting an agent with an isolated protein encoded by an ORF of the present
invention, or nucleic acid of the invention; and
(b) determining whether the agent binds to said protein or said nucleic acid.

In general, therefore, such methods for identifying compounds that bind to a
polynucleotide of the invention can comprise contacting a compound with a polynucleotide of
the invention for a time sufficient to form a polynucleotide/compound complex, and detecting
the complex, so that if a polynucleotide/compound complex is detected, a compound that binds
to a polynucleotide of the invention is identified.

Likewise, in general, therefore, such methods for identifying compounds that bind to a
polypeptide of the invention can comprise contacting a compound with a polypeptide of the
invention for a time sufficient to form a polypeptide/compound complex, and detecting the
complex, so that if a polypeptide/compound complex is detected, a compound that binds to a
polypeptide of the invention is identified.

Methods for identifying compounds that bind to a polypeptide of the invention can also
comprise contacting a compound with a polypeptide of the invention in a cell for a time
sufficient to form a polypeptide/compound complex, wherein the complex drives expression of a
reporter gene sequence in the cell, and detecting the complex by detecting reporter gene
sequence expression, so that if a polypeptide/compound complex is detected, a compound that
binds a polypeptide of the invention is identified.

Compounds identified via such methods can include compounds which modulate the
activity of a polypeptide of the invention (that is, increase or decrease its activity, relative to
activity observed in the absence of the compound). Alternatively, compounds identified via such
methods can include compounds which modulate the expression of a polynucleotide of the
invention (that is, increase or decrease expression relative to expression levels observed in the
absence of the compound). Compounds, such as compounds identified via the methods of the
invention, can be tested using standard assays well known to those of skill in the art for their
ability to modulate activity/expression.

The agents screened in the above assay can be, but are not limited to, peptides,
carbohydrates, vitamin derivatives, or other pharmaceutical agents. The agents can be selected
and screened at random or rationally selected or designed using protein modeling techniques.

For random screening, agents such as peptides, carbohydrates, pharmaceutical agents and
the like are selected at random and are assayed for their ability to bind to the protein encoded by
the ORF of the present invention. Alternatively, agents may be rationally selected or designed.
As used herein, an agent is said to be "rationally selected or designed" when the agent is chosen
based on the configuration of the particular protein. For example, one skilled in the art can
readily adapt currently available procedures to generate peptides, pharmaceutical agents and the
like, capable of binding to a specific peptide sequence, in order to generate rationally designed
antipeptide peptides, for example see Hurby et al., "Application of Synthetic Peptides: Antisense
Peptides," In Synthetic Peptides, A User's Guide, W.H. Freeman, NY (1992), pp. 289-307, and
Kaspczak et al., Biochemistry 28:9230-8 (1989), or pharmaceutical agents, or the like.

In addition to the foregoing, one class of agents of the present invention, as broadly
described, can be used to control gene expression through binding to one of the ORFs or EMFs
of the present invention. As described above, such agents can be randomly screened or
rationally designed/selected. Targeting the ORF or EMF allows a skilled artisan to design
sequence specific or element specific agents, modulating the expression of either a single ORF or
multiple ORFs which rely on the same EMF for expression control. One class of DNA binding
agents are agents which contain base residues which hybridize or form a triple helix formation
by binding to DNA or RNA. Such agents can be based on the classic phosphodiester,
ribonucleic acid backbone, or can be a variety of sulfhydryl or polymeric derivatives which have
base attachment capacity.

Agents suitable for use in these methods preferably contain 20 to 40 bases and are
designed to be complementary to a region of the gene involved in transcription (triple helix - see
Lee et al., Nucl. Acids Res. 6:3073 (1979); Cooney et al., Science 241:456 (1988); and Dervan et
al., Science 251:1360 (1991)) or to the mRNA itself (antisense - Okano, J. Neurochem. 56:560
(1991); Oligodeoxynucleotides as Antisense Inhibitors of Gene Expression, CRC Press, Boca
Raton, FL (1988)). Triple helix-formation optimally results in a shut-off of RNA transcription
from DNA, while antisense RNA hybridization blocks translation of an mRNA molecule into
polypeptide. Both techniques have been demonstrated to be effective in model systems.
Information contained in the sequences of the present invention is necessary for the design of an
antisense or triple helix oligonucleotide and other DNA binding agents.

Agents which bind to a protein encoded by one of the ORFs of the present invention can
be used as a diagnostic agent. Agents which bind to a protein encoded by one of the ORFs of the
present invention can be formulated using known techniques to generate a pharmaceutical
composition.

4.19 USE OF NUCLEIC ACIDS AS PROBES 

Another aspect of the subject invention is to provide for polypeptide-specific nucleic acid
hybridization probes capable of hybridizing with naturally occurring nucleotide sequences. The
hybridization probes of the subject invention may be derived from any of the nucleotide
sequences SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954. Because the
corresponding gene is only expressed in a limited number of tissues, a hybridization probe
derived from any of the nucleotide sequences SEQ ID NO: 1-984, 1969-2952, 3937-3942 or
3949-3954 can be used as an indicator of the presence of RNA of cell type of such a tissue in a
sample.

Any suitable hybridization technique can be employed, such as, for example, in situ
hybridization. PCR as described in US Patents Nos. 4,683,195 and 4,965,188 provides
additional uses for oligonucleotides based upon the nucleotide sequences. Such probes used in
PCR may be of recombinant origin, may be chemically synthesized, or a mixture of both. The
probe will comprise a discrete nucleotide sequence for the detection of identical sequences or a
degenerate pool of possible sequences for identification of closely related genomic sequences.
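
What a "degenerate pool of possible sequences" amounts to can be shown with a small sketch that
expands a probe written with IUPAC ambiguity codes into every discrete sequence it represents.
The use of IUPAC codes to write the probe, the function name and the example probes are
assumptions of this sketch.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_degenerate(probe):
    """Return every discrete sequence represented by a degenerate probe."""
    choices = (IUPAC[base] for base in probe.upper())
    return ["".join(combo) for combo in product(*choices)]
```

The pool size is the product of the number of choices at each position, so a probe such as
AARYTN (1 x 1 x 2 x 2 x 1 x 4) stands for 16 discrete sequences.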

Other means for producing specific hybridization probes for nucleic acids include the
cloning of nucleic acid sequences into vectors for the production of mRNA probes. Such vectors
are known in the art and are commercially available and may be used to synthesize RNA probes
in vitro by means of the addition of the appropriate RNA polymerase such as T7 or SP6 RNA
polymerase and the appropriate radioactively labeled nucleotides. The nucleotide sequences may
be used to construct hybridization probes for mapping their respective genomic sequences. The
nucleotide sequence provided herein may be mapped to a chromosome or specific regions of a
chromosome using well known genetic and/or chromosomal mapping techniques. These
techniques include in situ hybridization, linkage analysis against known chromosomal markers,
hybridization screening with libraries or flow-sorted chromosomal preparations specific to
known chromosomes, and the like. The technique of fluorescent in situ hybridization of
chromosome spreads has been described, among other places, in Verma et al. (1988) Human
Chromosomes: A Manual of Basic Techniques, Pergamon Press, New York, NY.

Fluorescent in situ hybridization of chromosomal preparations and other physical
chromosome mapping techniques may be correlated with additional genetic map data. Examples
of genetic map data can be found in the 1994 Genome Issue of Science (265:1981f). Correlation
between the location of a nucleic acid on a physical chromosomal map and a specific disease (or
predisposition to a specific disease) may help delimit the region of DNA associated with that
genetic disease. The nucleotide sequences of the subject invention may be used to detect
differences in gene sequences between normal, carrier or affected individuals.

4.20 PREPARATION OF SUPPORT BOUND OLIGONUCLEOTIDES

Oligonucleotides, i.e., small nucleic acid segments, may be readily prepared by, for
example, directly synthesizing the oligonucleotide by chemical means, as is commonly practiced
using an automated oligonucleotide synthesizer.

Support bound oligonucleotides may be prepared by any of the methods known to those of
skill in the art using any suitable support such as glass, polystyrene or Teflon. One strategy is to
precisely spot oligonucleotides synthesized by standard synthesizers. Immobilization can be
achieved using passive adsorption (Inouye & Hondo, (1990) J. Clin. Microbiol. 28(6) 1469-72);
using UV light (Nagata et al., 1985; Dahlen et al., 1987; Morrissey & Collins, (1989) Mol. Cell
Probes 3(2) 189-207) or by covalent binding of base modified DNA (Keller et al., 1988; 1989); all
references being specifically incorporated herein.

Another strategy that may be employed is the use of the strong biotin-streptavidin
interaction as a linker. For example, Broude et al. (1994) Proc. Natl. Acad. Sci. USA 91(8) 3072-6,
describe the use of biotinylated probes, although these are duplex probes, that are immobilized on
streptavidin-coated magnetic beads. Streptavidin-coated beads may be purchased from Dynal, Oslo.
Of course, this same linking chemistry is applicable to coating any surface with streptavidin.
Biotinylated probes may be purchased from various sources, such as, e.g., Operon Technologies
(Alameda, CA).

Nunc Laboratories (Naperville, IL) is also selling suitable material that could be used. Nunc
Laboratories have developed a method by which DNA can be covalently bound to the microwell
surface termed CovaLink NH. CovaLink NH is a polystyrene surface grafted with secondary amino
groups (>NH) that serve as bridge-heads for further covalent coupling. CovaLink Modules may be
purchased from Nunc Laboratories. DNA molecules may be bound to CovaLink exclusively at the
5'-end by a phosphoramidate bond, allowing immobilization of more than 1 pmol of DNA
(Rasmussen et al., (1991) Anal. Biochem. 198(1) 138-42).

The use of CovaLink NH strips for covalent binding of DNA molecules at the 5'-end has
been described (Rasmussen et al., (1991)). In this technology, a phosphoramidate bond is employed
(Chu et al., (1983) Nucleic Acids Res. 11(8) 6513-29). This is beneficial as immobilization using
only a single covalent bond is preferred. The phosphoramidate bond joins the DNA to the
CovaLink NH secondary amino groups that are positioned at the end of spacer arms covalently
grafted onto the polystyrene surface through a 2 nm long spacer arm. To link an oligonucleotide to
CovaLink NH via a phosphoramidate bond, the oligonucleotide terminus must have a 5'-end
phosphate group. It is, perhaps, even possible for biotin to be covalently bound to CovaLink and
then streptavidin used to bind the probes.

More specifically, the linkage method includes dissolving DNA in water (7.5 ng/µl) and
denaturing for 10 min. at 95°C and cooling on ice for 10 min. Ice-cold 0.1 M 1-methylimidazole,
pH 7.0 (1-MeIm7), is then added to a final concentration of 10 mM 1-MeIm7. A ss DNA solution is
then dispensed into CovaLink NH strips (75 µl/well) standing on ice.

Carbodiimide 0.2 M 1-ethyl-3-(3-dimethylaminopropyl)-carbodiimide (EDC), dissolved in
10 mM 1-MeIm7, is made fresh and 25 µl added per well. The strips are incubated for 5 hours at
50°C. After incubation the strips are washed using, e.g., Nunc-Immuno Wash; first the wells are
washed 3 times, then they are soaked with washing solution for 5 min., and finally they are washed
3 times (wherein the washing solution is 0.4 N NaOH, 0.25% SDS heated to 50°C).
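
The concentrations quoted in the linkage method imply some simple dilution arithmetic, sketched
below for illustration. The 90 µl starting volume, the assumption that volumes are additive and
the function names are hypothetical; only the stock and final concentrations and the 75 µl and
25 µl well volumes come from the text.

```python
def stock_volume_for_final(sample_volume, stock_conc, final_conc):
    """Volume of stock to add to a sample so the mixture reaches final_conc.

    Solves stock_conc * v = final_conc * (sample_volume + v) for v, assuming the
    sample initially contains none of the reagent and volumes are additive.
    """
    if not 0 < final_conc < stock_conc:
        raise ValueError("final concentration must lie between 0 and the stock concentration")
    return sample_volume * final_conc / (stock_conc - final_conc)


def diluted_conc(conc, volume, added_volume):
    """Concentration after a solution of given volume is diluted by added_volume."""
    return conc * volume / (volume + added_volume)


def well_contents(dna_ng_per_ul=7.5, dna_volume_ul=90.0):
    """Return (1-MeIm stock volume in ul, DNA per well in ng, EDC in the well in mM)."""
    # 0.1 M (100 mM) 1-methylimidazole stock brought to a final 10 mM.
    meim_ul = stock_volume_for_final(dna_volume_ul, 100.0, 10.0)
    dna_conc = diluted_conc(dna_ng_per_ul, dna_volume_ul, meim_ul)
    dna_per_well_ng = dna_conc * 75.0  # 75 ul dispensed per well
    # 25 ul of 0.2 M (200 mM) EDC added to the 75 ul already in the well.
    edc_in_well_mM = diluted_conc(200.0, 25.0, 75.0)
    return meim_ul, dna_per_well_ng, edc_in_well_mM
```

Under these assumptions, 90 µl of DNA solution takes 10 µl of the 0.1 M stock, each well
receives about 506 ng of DNA, and the EDC concentration in the well is 50 mM.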

It is contemplated that a further suitable method for use with the present invention is that
described in PCT Patent Application WO 90/03382 (Southern & Maskos), incorporated herein by
reference. This method of preparing an oligonucleotide bound to a support involves attaching a
nucleoside 3'-reagent through the phosphate group by a covalent phosphodiester link to aliphatic
hydroxyl groups carried by the support. The oligonucleotide is then synthesized on the supported
nucleoside and protecting groups removed from the synthetic oligonucleotide chain under standard
conditions that do not cleave the oligonucleotide from the support. Suitable reagents include
nucleoside phosphoramidite and nucleoside hydrogen phosphorate.

An on-chip strategy for the preparation of DNA probe arrays may be employed. For example,
addressable laser-activated photodeprotection may be employed in the chemical synthesis of
oligonucleotides directly on a glass surface, as described by Fodor et al. (1991) Science 251(4995)
767-73, incorporated herein by reference. Probes may also be immobilized on nylon supports as
described by Van Ness et al. (1991) Nucleic Acids Res. 19(12) 3345-50; or linked to Teflon using
the method of Duncan & Cavalier (1988) Anal. Biochem. 169(1) 104-8; all references being
specifically incorporated herein.

To link an oligonucleotide to a nylon support, as described by Van Ness et al. (1991),
requires activation of the nylon surface via alkylation and selective activation of the 5'-amine of
oligonucleotides with cyanuric chloride.

One particular way to prepare support bound oligonucleotides is to utilize the
light-generated synthesis described by Pease et al., (1994) PNAS USA 91(11) 5022-6 (incorporated
herein by reference). These authors used current photolithographic techniques to generate arrays of
immobilized oligonucleotide probes (DNA chips). These methods, in which light is used to direct
the synthesis of oligonucleotide probes in high-density, miniaturized arrays, utilize photolabile
5'-protected N-acyl-deoxynucleoside phosphoramidites, surface linker chemistry and versatile
combinatorial synthesis strategies. A matrix of 256 spatially defined oligonucleotide probes may be
generated in this manner.
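
The figure of 256 probes corresponds to every possible four-base sequence (4 x 4 x 4 x 4). As a
hedged illustration of how such a combinatorial matrix can be enumerated and addressed, the sketch
below assigns each tetranucleotide to a cell of a 16 x 16 grid. The row-major layout and the
function names are assumptions of this sketch and are not the layout used by the cited authors.

```python
from itertools import product

BASES = "ACGT"


def probe_matrix(length=4):
    """Lay out all 4**length probes on a square grid; returns {(row, col): probe}."""
    side = 2 ** length  # 4**length probes exactly fill a side x side square
    probes = ["".join(p) for p in product(BASES, repeat=length)]
    return {divmod(index, side): probe for index, probe in enumerate(probes)}


def masking_steps(length=4):
    """Coupling cycles needed if each position takes one masked coupling per base."""
    return len(BASES) * length
```

With one masked coupling per base at each of the four positions, the full set of 256 probes is
reached in 16 coupling cycles, which is what makes the combinatorial approach economical.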

4.21 PREPARATION OF NUCLEIC ACID FRAGMENTS

The nucleic acids may be obtained from any appropriate source, such as cDNAs, genomic
DNA, chromosomal DNA, microdissected chromosome bands, cosmid or YAC inserts, and RNA,
including mRNA without any amplification steps. For example, Sambrook et al. (1989) describes
three protocols for the isolation of high molecular weight DNA from mammalian cells (p.
9.14-9.23).

DNA fragments may be prepared as clones in M13, plasmid or lambda vectors and/or
prepared directly from genomic DNA or cDNA by PCR or other amplification methods. Samples
may be prepared or dispensed in multiwell plates. About 100-1000 ng of DNA samples may be
prepared in 2-500 ml of final volume.

The nucleic acids would then be fragmented by any of the methods known to those of skill
in the art including, for example, using restriction enzymes as described at 9.24-9.28 of Sambrook
et al. (1989), shearing by ultrasound and NaOH treatment.

Low pressure shearing is also appropriate, as described by Schriefer et al. (1990) Nucleic
Acids Res. 18(24) 7455-6 (incorporated herein by reference). In this method, DNA samples are
passed through a small French pressure cell at a variety of low to intermediate pressures. A lever
device allows controlled application of low to intermediate pressures to the cell. The results of these
studies indicate that low-pressure shearing is a useful alternative to sonic and enzymatic DNA
fragmentation methods.

One particularly suitable way for fragmenting DNA is contemplated to be that using the two
base recognition endonuclease, CviJI, described by Fitzgerald et al. (1992) Nucleic Acids Res.
20(14) 3753-62. These authors described an approach for the rapid fragmentation and fractionation
of DNA into particular sizes that they contemplated to be suitable for shotgun cloning and
sequencing.

The restriction endonuclease CviJI normally cleaves the recognition sequence PuGCPy
between the G and C to leave blunt ends. Atypical reaction conditions, which alter the specificity of
this enzyme (CviJI**), yield a quasi-random distribution of DNA fragments from the small
molecule pUC19 (2688 base pairs). Fitzgerald et al. (1992) quantitatively evaluated the
randomness of this fragmentation strategy, using a CviJI** digest of pUC19 that was size
fractionated by a rapid gel filtration method and directly ligated, without end repair, to a lac Z minus
M13 cloning vector. Sequence analysis of 76 clones showed that CviJI** restricts PyGCPy and
PuGCPu, in addition to PuGCPy sites, and that new sequence data is accumulated at a rate
consistent with random fragmentation.
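
The difference between the normal and relaxed site specificities can be illustrated with a short sketch. The Python code below is a minimal illustration only and is not part of the Fitzgerald et al. protocol. It assumes Pu = A or G and Py = C or T, models each recognition site as a regular expression, and places the blunt cut between the G and the C. The function names and the toy sequence in the usage note are hypothetical.

```python
import re

# Recognition sites written as regular expressions (assumption: Pu = A/G, Py = C/T).
NORMAL_SITES = ("[AG]GC[CT]",)                             # PuGCPy
RELAXED_SITES = ("[AG]GC[CT]", "[CT]GC[CT]", "[AG]GC[AG]")  # plus PyGCPy and PuGCPu


def cut_positions(seq, sites):
    """Return sorted 0-based cut positions (cut lies between the G and the C)."""
    seq = seq.upper()
    cuts = set()
    for site in sites:
        # A lookahead lets overlapping sites be reported as well.
        for match in re.finditer(f"(?={site})", seq):
            cuts.add(match.start() + 2)
    return sorted(cuts)


def fragments(seq, sites):
    """Split seq at every cut position and return the blunt-ended fragments."""
    seq = seq.upper()
    bounds = [0] + cut_positions(seq, sites) + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:])]
```

For a toy sequence such as "AAAGCTTTTGCAAACGCC", calling fragments with NORMAL_SITES cuts only at the AGCT site, whereas RELAXED_SITES also cuts at CGCC; the TGCA site (PyGCPu) is left uncut under both settings.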

As reported in the literature, advantages of this approach compared to sonication and
agarose gel fractionation include: smaller amounts of DNA are required (0.2-0.5 µg instead of 2-5
µg); and fewer steps are involved (no preligation, end repair, chemical extraction, or agarose gel
electrophoresis and elution are needed).

Irrespective of the manner in which the nucleic acid fragments are obtained or prepared, it is
important to denature the DNA to give single stranded pieces available for hybridization. This is
achieved by incubating the DNA solution for 2-5 minutes at 80-90°C. The solution is then cooled
quickly to 2°C to prevent renaturation of the DNA fragments before they are contacted with the
chip. Phosphate groups must also be removed from genomic DNA by methods known in the art.

4.22 PREPARATION OF DNA ARRAYS

Arrays may be prepared by spotting DNA samples on a support such as a nylon membrane.
Spotting may be performed by using arrays of metal pins (the positions of which correspond to an
array of wells in a microtiter plate) to repeatedly transfer about 20 nl of a DNA solution to a
nylon membrane. By offset printing, a density of dots higher than the density of the wells is
achieved. One to 25 dots may be accommodated in 1 mm², depending on the type of label used. By
avoiding spotting in some preselected number of rows and columns, separate subsets (subarrays)
may be formed. Samples in one subarray may be the same genomic segment of DNA (or the same
gene) from different individuals, or may be different, overlapped genomic clones. Each of the
subarrays may represent replica spotting of the same samples. In one example, a selected gene
segment may be amplified from 64 patients. For each patient, the amplified gene segment may be in
one 96-well plate (all 96 wells containing the same sample). A plate for each of the 64 patients is
prepared. By using a 96-pin device, all samples may be spotted on one 8 x 12 cm membrane.
Subarrays may contain 64 samples, one from each patient. Where the 96 subarrays are identical, the
dot span may be 1 mm² and there may be a 1 mm space between subarrays.
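
The geometry of the 64-patient example can be checked with a short calculation. The Python sketch below is illustrative only. It assumes that each subarray is laid out as an 8 x 8 grid of dots on a 1 mm pitch, that the 96 pins form an 8 x 12 grid, and that the 1 mm space between subarrays gives a 9 mm subarray pitch. None of these layout details beyond the dot span and spacing are stated in the text, and the function name is hypothetical.

```python
PIN_ROWS, PIN_COLS = 8, 12        # 96-pin device (assumed 8 x 12 arrangement)
SUBARRAY_SIDE = 8                 # 8 x 8 = 64 dots, one per patient (assumed layout)
DOT_PITCH_MM = 1.0                # dot span of about 1 mm
GAP_MM = 1.0                      # 1 mm space between subarrays
SUBARRAY_PITCH_MM = SUBARRAY_SIDE * DOT_PITCH_MM + GAP_MM  # 9 mm between pin origins


def dot_position(patient, pin_row, pin_col):
    """Return (x, y) in mm of the dot printed by one pin for one patient plate.

    Each patient plate is printed with a different offset, so every pin
    builds up its own 64-dot subarray (offset printing).
    """
    if not 0 <= patient < SUBARRAY_SIDE * SUBARRAY_SIDE:
        raise ValueError("patient index must be in 0..63")
    if not (0 <= pin_row < PIN_ROWS and 0 <= pin_col < PIN_COLS):
        raise ValueError("pin index out of range")
    offset_row, offset_col = divmod(patient, SUBARRAY_SIDE)
    x = pin_col * SUBARRAY_PITCH_MM + offset_col * DOT_PITCH_MM
    y = pin_row * SUBARRAY_PITCH_MM + offset_row * DOT_PITCH_MM
    return x, y
```

Under these assumptions the last dot of the last subarray lies about 106 mm by 70 mm from the first dot, so the 6144 dots fit on an 8 x 12 cm membrane.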

Another approach is to use membranes or plates (available from NUNC, Naperville, Illinois)
which may be partitioned by physical spacers, e.g., a plastic grid molded over the membrane, the grid
being similar to the sort of membrane applied to the bottom of multiwell plates, or hydrophobic
strips. A fixed physical spacer is not preferred for imaging by exposure to flat phosphor-storage
screens or x-ray films.

The present invention is illustrated in the following examples. Upon consideration of the
present disclosure, one of skill in the art will appreciate that many other embodiments and variations
may be made in the scope of the present invention. Accordingly, it is intended that the broader
aspects of the present invention not be limited to the disclosure of the following examples. The
present invention is not to be limited in scope by the exemplified embodiments, which are intended
as illustrations of single aspects of the invention, and compositions and methods which are
functionally equivalent are within the scope of the invention. Indeed, numerous modifications and
variations in the practice of the invention are expected to occur to those skilled in the art upon
consideration of the present preferred embodiments. Consequently, the only limitations which
should be placed upon the scope of the invention are those which appear in the appended claims.

All references cited within the body of the instant specification are hereby incorporated by
reference in their entirety.

5.0 EXAMPLES 

5.1 EXAMPLE 1

Novel Nucleic Acid Sequences Obtained From Various Libraries

A plurality of novel nucleic acids were obtained from cDNA libraries prepared from various
human tissues and in some cases isolated from a genomic library derived from human chromosome
using standard PCR, SBH sequence signature analysis and Sanger sequencing techniques. The
inserts of the library were amplified with PCR using primers specific for the vector sequences which
flank the inserts. Clones from cDNA libraries were spotted on nylon membrane filters and screened
with oligonucleotide probes (e.g., 7-mers) to obtain signature sequences. The clones were clustered
into groups of similar or identical sequences. Representative clones were selected for sequencing.
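
The clustering of clones by probe signature may be pictured with the following Python sketch. It is a simplified illustration and not the procedure actually used: real SBH signatures are derived from hybridization intensities, whereas here a signature is assumed to be the set of probes found as exact substrings of the clone sequence, and clones are grouped greedily by Jaccard similarity against an arbitrary threshold. All function names, the threshold and the toy data in the usage note are hypothetical.

```python
def signature(clone_seq, probes):
    """Set of probes (e.g., 7-mers) that occur in the clone sequence."""
    seq = clone_seq.upper()
    return frozenset(p for p in probes if p in seq)


def jaccard(a, b):
    """Jaccard similarity of two probe sets (1.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def cluster(clones, probes, threshold=0.8):
    """Greedy clustering: join the first cluster whose representative is similar."""
    clusters = []  # list of (representative signature, [clone names])
    for name, seq in clones.items():
        sig = signature(seq, probes)
        for rep_sig, members in clusters:
            if jaccard(sig, rep_sig) >= threshold:
                members.append(name)
                break
        else:
            clusters.append((sig, [name]))
    return [members for _, members in clusters]
```

With probes "ACGTACG", "TTTTTTT" and "GGGCCCA", two clones that both contain ACGTACG fall into one cluster and a clone containing only TTTTTTT forms its own; one representative per cluster would then be selected for sequencing.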

In some cases, the 5' sequence of the amplified inserts was then deduced using a typical
Sanger sequencing protocol. PCR products were purified and subjected to fluorescent dye
terminator cycle sequencing. Single pass gel sequencing was done using a 377 Applied Biosystems
(ABI) sequencer to obtain the novel nucleic acid sequences. In some cases RACE (Rapid
Amplification of cDNA Ends) was performed to further extend the sequence in the 5' direction.

5.2 EXAMPLE 2

Assemblage of Novel Nucleic Acids

The contigs or nucleic acids of the present invention, designated as SEQ ID NO: 1969-2951,
and 3949-3954 were assembled using an EST sequence as a seed. Then a recursive algorithm was
used to extend the seed EST into an extended assemblage, by pulling additional sequences from
different databases (i.e., Hyseq's database containing EST sequences, dbEST version 114, gb pri
114, and UniGene version 101) that belong to this assemblage. The algorithm terminated when
there were no additional sequences from the above databases that would extend the assemblage.
Inclusion of component sequences into the assemblage was based on a BLASTN hit to the
extending assemblage with BLAST score greater than 300 and percent identity greater than 95%.
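
The seed-extension loop and its inclusion thresholds can be sketched as follows. This Python code is a minimal illustration under stated assumptions, not the algorithm as implemented: the search argument is a hypothetical callback standing in for a BLASTN search of the databases, each member sequence is queried individually rather than the growing assemblage as a whole, and the Hit record and function names are invented for the example. Only the two thresholds (score greater than 300, identity greater than 95%) come from the text.

```python
from dataclasses import dataclass

MIN_SCORE = 300        # BLAST score must be greater than this
MIN_IDENTITY = 95.0    # percent identity must be greater than this


@dataclass(frozen=True)
class Hit:
    seq_id: str
    score: float
    identity: float    # percent identity


def accept(hit):
    """Apply the inclusion criteria to one hit."""
    return hit.score > MIN_SCORE and hit.identity > MIN_IDENTITY


def extend_assemblage(seed_id, search):
    """Grow an assemblage from a seed until no new sequence qualifies.

    search(seq_id) is a hypothetical stand-in for a BLASTN query and must
    return an iterable of Hit objects.
    """
    members = {seed_id}
    frontier = [seed_id]
    while frontier:                      # terminates when nothing new is added
        current = frontier.pop()
        for hit in search(current):
            if accept(hit) and hit.seq_id not in members:
                members.add(hit.seq_id)
                frontier.append(hit.seq_id)
    return members
```

Because both comparisons are strict, a hit with exactly 95% identity or a score of exactly 300 is excluded in this sketch.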

Tables 6 and 8 set forth the novel predicted polypeptides (including proteins) encoded by
the novel polynucleotides (SEQ ID NO: 2953-3936, and 3949-3954) of the present invention, and
their corresponding nucleotide locations to each of SEQ ID NO: 2953-3936 and 3955-3960. Tables
6 and 8 also indicate the method by which the polypeptide was predicted. Method A refers to a
polypeptide obtained by using a software program called FASTY (available from
http://fasta.bioch.virginia.edu) which selects a polypeptide based on a comparison of the translated
novel polynucleotide to known polynucleotides (W.R. Pearson, Methods in Enzymology, 183:63-98
(1990), herein incorporated by reference). Method B refers to a polypeptide obtained by using a
software program called GenScan for human/vertebrate sequences (available from Stanford
University, Office of Technology Licensing) that predicts the polypeptide based on a probabilistic
model of gene structure/compositional properties (C. Burge and S. Karlin, J. Mol. Biol., 268:78-94
(1997), incorporated herein by reference). Method C refers to a polypeptide obtained by using a
Hyseq proprietary software program that translates the novel polynucleotide and its complementary
strand into six possible amino acid sequences (forward and reverse frames) and chooses the
polypeptide with the longest open reading frame.
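
The idea behind Method C can be illustrated with a short Python sketch. The Hyseq program itself is proprietary and is not reproduced here; the code below is only an assumed reading of the description. It uses the standard genetic code, translates the three forward and three reverse-complement frames, and defines an open reading frame simply as the longest stop-free stretch (a start codon is not required in this sketch). All names are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq):
    """Translate complete codons; unknown codons (e.g., containing N) become X."""
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3))


def six_frames(seq):
    """Yield (strand, offset, protein) for the three forward and three reverse frames."""
    seq = seq.upper()
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            yield strand, offset, translate(s[offset:])


def longest_orf(seq):
    """Return (peptide, strand, offset) for the longest stop-free stretch."""
    best = ("", None, None)
    for strand, offset, protein in six_frames(seq):
        for segment in protein.split("*"):
            if len(segment) > len(best[0]):
                best = (segment, strand, offset)
    return best
```

Ties are resolved in favor of the first frame examined, which is an arbitrary choice made for this sketch.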

5.3 EXAMPLE 3
Novel Nucleic Acids

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), full length gene cDNA sequences
and their corresponding protein sequences were generated from the assemblage. Any frame shifts
and incorrect stop codons were corrected by hand editing. During editing, the sequence was
checked using FASTY and/or BLAST against Genbank. Other computer programs which may
have been used in the editing process were phredPhrap and Consed (University of Washington) and
ed-ready, ed-ext and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide sequences are shown in the
Sequence Listing as SEQ ID NO: 1-351. The amino acids are SEQ ID NO: 985-1335.

Table 1 shows the various tissue sources of SEQ ID NO: 1-351.

The nearest neighbor results for SEQ ID NO: 1-351 were obtained by a BLASTP version
2.0al 19MP-WashU search against Genpept release 120 and Geneseq October 12, 2000 release
21 (Derwent), using the BLAST algorithm. The nearest neighbor result showed the closest
homologue for SEQ ID NO: 1-351 from Genpept. The translated amino acid sequences for
which the nucleic acid sequence encodes are shown in the Sequence Listing. The homologs
with identifiable functions for SEQ ID NO: 1-351 are shown in Table 2 below.

Using eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp.
Biol., Vol. 6 pp. 219-235 (1999) herein incorporated by reference), all the sequences were
examined to determine whether they had identifiable signature regions. Table 3 shows the
signature region found in the indicated polypeptide sequences, the description of the signature,
the eMatrix p-value(s) and the position(s) of the signature within the polypeptide sequence.

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1)
pp. 320-322 (1998) herein incorporated by reference) all the polypeptide sequences were
examined for domains with homology to certain peptide domains. Table 4 shows the name of
the domain found, the description, the p-value and the pFam score for the identified domain
within the sequence.

The nucleotide sequence within the sequences that codes for signal peptide sequences and
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from
Center for Biological Sequence Analysis, The Technical University of Denmark). The process for
identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also disclosed by
Henrik Nielson, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the publication
"Identification of prokaryotic and eukaryotic signal peptides and prediction of their cleavage sites"
Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by reference. A maximum
S score and a mean S score, as described in the Nielson et al. reference, were obtained for the
polypeptide sequences. Table 7 shows the position of the signal peptide in each of the polypeptides
and the maximum score and mean score associated with that signal peptide.

5.4 EXAMPLE 4 
Novel Nucleic Acids

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA
sequence and its corresponding protein sequence were generated from the assemblage. Any frame
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was
checked using FASTY and/or BLAST against Genbank (i.e., dbEST version 117, gb pri 117,
UniGene version 117, Genpept release 117). Other computer programs which may have been used
in the editing process were phredPhrap and Consed (University of Washington) and ed-ready,
ed-ext and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide sequences, including splice variants
resulting from these procedures, are shown in the Sequence Listing as SEQ ID NOS: 352-766. The
corresponding amino acids are SEQ ID NO: 1336-1750.

Table 1 shows the various tissue sources of SEQ ID NO: 352-766.

The nearest neighbor results for SEQ ID NO: 352-766 were obtained by a BLASTP
version 2.0al 19MP-WashU search against Genpept release 120 and Geneseq October 12, 2000
release 21 (Derwent), using the BLAST algorithm. The nearest neighbor result showed the closest
homologue for SEQ ID NO: 352-766 from Genpept. The translated amino acid sequences for
which the nucleic acid sequence encodes are shown in the Sequence Listing. The homologs with
identifiable functions for SEQ ID NO: 352-766 are shown in Table 2 below.

Using eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp.
Biol., Vol. 6 pp. 219-235 (1999) herein incorporated by reference), all the sequences were
examined to determine whether they had identifiable signature regions. Table 3 shows the
signature region found in the indicated polypeptide sequences, the description of the signature,
the eMatrix p-value(s) and the position(s) of the signature within the polypeptide sequence.

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1)
pp. 320-322 (1998) herein incorporated by reference) all the polypeptide sequences were
examined for domains with homology to certain peptide domains. Table 4 shows the name of
the domain found, the description, the p-value and the pFam score for the identified domain
within the sequence.

The nucleotide sequence within the sequences that codes for signal peptide sequences and
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from
Center for Biological Sequence Analysis, The Technical University of Denmark). The process
for identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also
disclosed by Henrik Nielson, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the
publication "Identification of prokaryotic and eukaryotic signal peptides and prediction of their
cleavage sites" Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by
reference. A maximum S score and a mean S score, as described in the Nielson et al. reference,
were obtained for the polypeptide sequences. Table 7 shows the position of the signal peptide in
each of the polypeptides and the maximum score and mean score associated with that signal
peptide.

5.5 EXAMPLE 5
Novel Nucleic Acids

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA
sequence and its corresponding protein sequence were generated from the assemblage. Any frame
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was
checked using FASTY and/or BLAST against Genbank (i.e., dbEST version 118, gb pri 118,
UniGene version 118, Genpept release 118). Other computer programs which may have been used
in the editing process were phredPhrap and Consed (University of Washington) and ed-ready,
ed-ext and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide sequences, including splice variants
resulting from these procedures, are shown in the Sequence Listing as SEQ ID NOS: 767-930. The
corresponding amino acid sequences are SEQ ID NO: 1751-1914.

Table 1 shows the various tissue sources of SEQ ID NO: 767-930.

The homology results for SEQ ID NO: 767-930 were obtained by a BLASTP version
2.0al 19MP-WashU search against Genpept release 120 and Geneseq October 12, 2000 release
21 (Derwent), using the BLAST algorithm. The nearest neighbor result showed homologs for
SEQ ID NO: 767-930 from Genpept. The translated amino acid sequences for which the nucleic
acid sequence encodes are shown in the Sequence Listing. The homologues with identifiable
functions for SEQ ID NO: 767-930 are shown in Table 2 below.

Using eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp.
Biol., Vol. 6 pp. 219-235 (1999) herein incorporated by reference), all the sequences were
examined to determine whether they had identifiable signature regions. Table 3 shows the
signature region found in the indicated polypeptide sequences, the description of the signature,
the eMatrix p-value(s) and the position(s) of the signature within the polypeptide sequence.

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1)
pp. 320-322 (1998) herein incorporated by reference) all the polypeptide sequences were
examined for domains with homology to certain peptide domains. Table 4 shows the name of
the domain found, the description, the p-value and the pFam score for the identified domain
within the sequence.

The nucleotide sequence within the sequences that codes for signal peptide sequences and
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from
Center for Biological Sequence Analysis, The Technical University of Denmark). The process
for identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also
disclosed by Henrik Nielson, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the
publication "Identification of prokaryotic and eukaryotic signal peptides and prediction of their
cleavage sites" Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by
reference. A maximum S score and a mean S score, as described in the Nielson et al. reference,
were obtained for the polypeptide sequences. Table 7 shows the position of the signal peptide in
each of the polypeptides and the maximum score and mean score associated with that signal
peptide.

5.6 EXAMPLE 6 
Novel Nucleic Acids

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA
sequence and its corresponding protein sequence were generated from the assemblage. Any frame
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was
checked using FASTY and/or BLAST against Genbank (i.e., dbEST version 118, gb pri 118,
UniGene version 118, Genpept release 118). Other computer programs which may have been used
in the editing process were phredPhrap and Consed (University of Washington) and ed-ready,
ed-ext and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide sequences, including splice variants
resulting from these procedures, are shown in the Sequence Listing as SEQ ID NOS: 931-965. The
corresponding amino acid sequences are shown in SEQ ID NO: 1915-1949.

Table 1 shows the various tissue sources of SEQ ID NO: 931-965.

The nearest neighbor results for SEQ ID NO: 931-965 were obtained by a BLASTP
version 2.0al 19MP-WashU search against Genpept release 120 and Geneseq October 12, 2000
release (Derwent), using the BLAST algorithm. The nearest neighbor result showed the closest
homologue for SEQ ID NO: 931-965 from Genpept. The translated amino acid sequences for
which the nucleic acid sequence encodes are shown in the Sequence Listing. The homologs
with identifiable functions for SEQ ID NO: 931-965 are shown in Table 2 below.

Using eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp.
Biol., Vol. 6 pp. 219-235 (1999) herein incorporated by reference), all the sequences were
examined to determine whether they had identifiable signature regions. Table 3 shows the
signature region found in the indicated polypeptide sequences, the description of the signature,
the eMatrix p-value(s) and the position(s) of the signature within the polypeptide sequence.

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1)
pp. 320-322 (1998) herein incorporated by reference) all the polypeptide sequences were
examined for domains with homology to certain peptide domains. Table 4 shows the name of
the domain found, the description, the p-value and the pFam score for the identified domain
within the sequence.

The nucleotide sequence within the sequences that codes for signal peptide sequences and
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from
Center for Biological Sequence Analysis, The Technical University of Denmark). The process
for identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also
disclosed by Henrik Nielson, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the
publication "Identification of prokaryotic and eukaryotic signal peptides and prediction of their
cleavage sites" Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by
reference. A maximum S score and a mean S score, as described in the Nielson et al. reference,
were obtained for the polypeptide sequences. Table 7 shows the position of the signal peptide in
each of the polypeptides and the maximum score and mean score associated with that signal
peptide.

5.7 EXAMPLE 7 
Novel Nucleic Acids

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA
sequence and its corresponding protein sequence were generated from the assemblage. Any frame
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was
checked using FASTY and/or BLAST against Genbank (i.e., dbEST version 119, gb pri 119,
UniGene version 119, Genpept release 119). Other computer programs which may have been used
in the editing process were phredPhrap and Consed (University of Washington) and ed-ready,
ed-ext and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide sequences, including splice variants
resulting from these procedures, are shown in the Sequence Listing as SEQ ID NOS: 966-974. The
corresponding amino acid sequences are SEQ ID NO: 1950-1958.

Table 1 shows the various tissue sources of SEQ ID NO: 966-974.

The nearest neighbor results for SEQ ID NO: 966-974 were obtained by a BLASTP
version 2.0al 19MP-WashU search against Genpept release 120 and Geneseq October 12, 2000
release (Derwent), using the BLAST algorithm. The nearest neighbor result showed the closest
homologue for SEQ ID NO: 966-974 from Genpept. The translated amino acid sequences for
which the nucleic acid sequence encodes are shown in the Sequence Listing. The homologs
with identifiable functions for SEQ ID NO: 966-974 are shown in Table 2 below.

Using eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp.
Biol., Vol. 6 pp. 219-235 (1999) herein incorporated by reference), all the sequences were
examined to determine whether they had identifiable signature regions. Table 3 shows the
signature region found in the indicated polypeptide sequences, the description of the signature,
the eMatrix p-value(s) and the position(s) of the signature within the polypeptide sequence.

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1)
pp. 320-322 (1998) herein incorporated by reference) all the polypeptide sequences were
examined for domains with homology to certain peptide domains. Table 4 shows the name of
the domain found, the description, the p-value and the pFam score for the identified domain
within the sequence.

The nucleotide sequence within the sequences that codes for signal peptide sequences and
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from
Center for Biological Sequence Analysis, The Technical University of Denmark). The process
for identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also
disclosed by Henrik Nielson, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the
publication "Identification of prokaryotic and eukaryotic signal peptides and prediction of their
cleavage sites" Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by
reference. A maximum S score and a mean S score, as described in the Nielson et al. reference,
were obtained for the polypeptide sequences. Table 7 shows the position of the signal peptide in
each of the polypeptides and the maximum score and mean score associated with that signal
peptide.

5.8 EXAMPLE 8
Novel Nucleic Acids

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA
sequence and its corresponding protein sequence were generated from the assemblage. Any frame
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was
checked using FASTY and/or BLAST against Genbank (i.e., dbEST version 120, gb pri 120,
UniGene version 120, Genpept release 120). Other computer programs which may have been used
in the editing process were phredPhrap and Consed (University of Washington) and ed-ready,
ed-ext and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide sequences, including splice variants
resulting from these procedures, are shown in the Sequence Listing as SEQ ID NOS: 975-984. The
corresponding amino acid sequences are SEQ ID NO: 1959-1968.

Table 1 shows the various tissue sources of SEQ ID NO: 975-984. 

The nearest neighbor results for SEQ ID NO: 975-984 were obtained by a BLASTP
version 2.0al 19MP-WashU search against Genpept release 120 and Geneseq October 21, 2000
release (Derwent), using the BLAST algorithm. The nearest neighbor result showed the closest
homologue for SEQ ID NO: 975-984 from Genpept. The translated amino acid sequences for
which the nucleic acid sequence encodes are shown in the Sequence Listing. The homologs
with identifiable functions for SEQ ID NO: 975-984 are shown in Table 2 below.

Using eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp.
Biol., Vol. 6 pp. 219-235 (1999) herein incorporated by reference), all the sequences were
examined to determine whether they had identifiable signature regions. Table 3 shows the
signature region found in the indicated polypeptide sequences, the description of the signature,
the eMatrix p-value(s) and the position(s) of the signature within the polypeptide sequence.

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1)
pp. 320-322 (1998) herein incorporated by reference) all the polypeptide sequences were
examined for domains with homology to certain peptide domains. Table 4 shows the name of
the domain found, the description, the p-value and the pFam score for the identified domain
within the sequence.

The nucleotide sequence within the sequences that codes for signal peptide sequences and
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from
Center for Biological Sequence Analysis, The Technical University of Denmark). The process
for identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also
disclosed by Henrik Nielson, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the
publication "Identification of prokaryotic and eukaryotic signal peptides and prediction of their
cleavage sites" Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by
reference. A maximum S score and a mean S score, as described in the Nielson et al. reference,
were obtained for the polypeptide sequences. Table 7 shows the position of the signal peptide in
each of the polypeptides and the maximum score and mean score associated with that signal
peptide.

5.9 EXAMPLE 9
Novel Nucleic Acids

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA
sequence and its corresponding protein sequence were generated from the assemblage. Any frame
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was
checked using FASTY and/or BLAST against Genbank (i.e., dbEST version 120, gb pri 120,
UniGene version 120, Genpept release 120). Other computer programs which may have been used
in the editing process were phredPhrap and Consed (University of Washington) and ed-ready,
ed-ext and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide sequences, including splice variants
resulting from these procedures, are shown in the Sequence Listing as SEQ ID NOS: 3937-3942. The
corresponding peptide sequences are SEQ ID NO: 3943-3948.

Table 1 shows the various tissue sources of SEQ ID NO: 3937-3942.

The nearest neighbor results for SEQ ID NO: 3937-3942 were obtained by a BLASTP
version 2.0al 19MP-WashU search against Genpept release 120 and Geneseq October 12, 2000
release 21 (Derwent), using the BLAST algorithm. The nearest neighbor result showed the closest
homologue for SEQ ID NO: 3937-3942 from Genpept. The translated amino acid sequences for
which the nucleic acid sequence encodes are shown in the Sequence Listing. The homologs
with identifiable functions for SEQ ID NO: 3937-3942 are shown in Table 9 below.

Using eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp.
Biol., Vol. 6 pp. 219-235 (1999) herein incorporated by reference), all the sequences were
examined to determine whether they had identifiable signature regions. Table 10 shows the
signature region found in the indicated polypeptide sequences, the description of the signature,
the eMatrix p-value(s) and the position(s) of the signature within the polypeptide sequence.

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1)
pp. 320-322 (1998) herein incorporated by reference) all the polypeptide sequences were
examined for domains with homology to certain peptide domains. Table 11 shows the name of
the domain found, the description, the p-value and the pFam score for the identified domain
within the sequence.

The nucleotide sequence within the sequences that codes for signal peptide sequences and
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from
Center for Biological Sequence Analysis, The Technical University of Denmark). The process
for identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also
disclosed by Henrik Nielson, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the
publication "Identification of prokaryotic and eukaryotic signal peptides and prediction of their
cleavage sites" Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by
reference. A maximum S score and a mean S score, as described in the Nielson et al. reference,
were obtained for the polypeptide sequences. Table 12 shows the position of the signal peptide in
each of the polypeptides and the maximum score and mean score associated with that signal
peptide.

Tables 5 and 13 are correlation tables of all of the sequences and the SEQ ID NOS.



TABLE 1 



Tissue Origin 


RNA 
Source 


Library 
Name 


SEQ ID NOS:


lung 






3 11 25 49 65 75 114 141 156 160 172 
190 198 209 217 224 229 234-235 267
269 274 277 282 284 303 308 312 320 
334 336 352 372 396 398 412 414 437 
453 464 470 481 492-494 508-509 532 
539 581 584 617-619 621 628 633 643 
688 691 745 752 761 768 794 822 837 
848 876 887 953 967 973 


adult brain


GIBCO 


AB3001 


1 3 12-13 16 22-24 28-29 41 48 58 65 78 
82 89-90 94 97 103 112 114-115 117 120 
122 130-131 168 181 184 186-187 189- 
190 198 208 216 247 249 259 270 277 
297 301 308 312 314 321 333 348 374 
396 403 406 410 412 416-417 420 423 
426-427 431 456 474 481 484-485 488 
498 500 508-509 530 549 553 558 563- 
564 583 596 602-603 608 612 621-622 
624 643 650 674 699 711 736 738-739
753 770 779-780 785-786 802-803 816 
822 839 842 848 859 861 871 893-894 
897 900 903 925 954 958 967 969 


adult brain 


GIBCO 


ABD003 


3 19 21-25 28-29 31 33-34 37 39 41 46-48
53 58 63-64 66 72 78 80 99 103 109-110
112 114 118 120-124 126 132-133 135
139 143 146 148-149 159 163 168 174
176 179-180 184-185 188-190 202 208- 
209 216-217 221 223 230 234-235 240 
244 249 251 253 255 258-259 263 269- 
270 277 282 285-286 290 294-295 297 
301-302 304-305 307-308 311-312 314 
320 329 333 335-336 342 344 346 349 
354 358 365 370 373-374 377 380 382- 
383 388 394-396 399 401-402 406 409- 
410 413 416 420-421 425 428 430-431
436-437 442 456 462 464 466-467 474 
484 486 495-496 500-501 506 508-509 
519 530 537 542 549 561-562 564 572 
574 577-578 580-583 586-587 589 592- 
593 596-597 601 608 610 612-614 617- 
624 630-632 635 637 650 658 663-664 
668 676 679 681 689-690 693 699 724 
726 732 736 742-743 747 767-770 780 
784 789 793 799 802-805 813 817-818 
822 824 829-831 837 839 845 848 856 
859-860 864 871-872 875-876 881 887 
896-897 901 903 907 910-911 925 930 
933 943-944 947 952-953 958 962-963 
965 967 972 977 


adult brain 


Clontech 


ABR001


3 53 66 113 115 126 135 160 172 179 185 
204 263 273 305 312 323 358 380 383 
395-396 403 420 428-429 431 461 542 
583 586 606-607 611 620 645-646 688
690 715 732 736 740 748 754 768 784- 
786 790 796 800 878 897 906-907 947 
977 


adult brain 


Clontech 


ABR006 


19 32 49 53 60 72 91 103 118 125 130- 
131 134 184 224 275 338 350 354 361- 
363 374 384 390 394 396 431-432 434- 
435 445 468 549 621 732 734-736 745 
760-761 764 768-769 775 787 806 811 
818 887 903 906 918 930 942 947 957 
973 977 


adult brain 


Clontech 


ABR008 


2-3 9-11 14 17 21 23-25 28-29 31-35 37
41-42 45 47-48 56-57 65-66 69-70 72 75 
77-78 88 91-92 97-99 101 103 112-115 
118-128 130-131 135 138-140 142 144- 
146 148 152 156-157 159-160 163 168 
172 174 176 178-180 182-190 194 196- 
198 200-201 204 209-214 218 220-225 
228-230 232-233 238-240 243-244 246 
254-256 260-264 270 272-274 278-279 

y\e\^ f^ac ^or\ ortl ono '^QA OO^ 00*7 

282-285 289-291 Zyi-Zy'* Zyo-zy/ ovi 
303-306 312-314 317 321-322 325-328 
334 336 338 340-342 344 346 348 350- 
352 354 356-358 363 366 369-374 376 
379-381 383-386 388-394 398-399 402- 


403 405 409-412 414 418-421 423-424 
426-427 430 433-437 443 445-450 452 
456-457 460 462 464 471 479 482-483 
485 488 490-498 505 507 510 516 519- 
522 524 527-532 535 538-539 542-545 
548 551 553 555 561-562 566 569 571 
574 580-583 588-589 593 597 601-608 
611-612 614-615 617-618 621-622 624 
630-635 642 644 646-648 650-652 655 
657 659-661 664-665 668 672 674 689 
693-699 701-702 708 711 715 717 724 
728-730 732 734-735 738-740 745 747- 
750 753-755 757 761 763-764 766-769 
772-773 775 780-781 789-791 793-795 
799-800 802-806 809 812 818-819 821- 
822 826 829-830 832 834-835 841 843 
845 856 858-859 861 864 866 870 872 
876 880 883 885 887 893-898 902 906- 
916 918 921 925-926 930-931 933 942- 
943 946 948 950-951 953-954 958-960 
962-965 967 969-970 972 977


adult brain 


Clontech 


ABR011


57 196 270 304 344 436 834 


adult brain 


BioChain 


ABR012 


14 82 121-122 168 691 


adult brain 


Invitrogen 


ABR013 


72 108 263 270 336 425 492-494 732 787 
790 826 880 


adult brain 


Invitrogen 


ABR014 


293 394 399 764 768-769 928 967 


adult brain 


Invitrogen 


ABR015 


738-739 764 


adult brain


Invitrogen 


ABR016 


320 374 396 399 405 684 742-743 767 
931 947 967 


adult brain 


Invitrogen 


ABT004 


21 33-34 37-38 47 52 57-58 69 72 91-93 
109 119 122-124 126-127 135 142-143 
158 167-168 185-188 194 200 212 232 
242 246 255 258 270 277 279 293 301 
312-313 319 322-323 331 341 346 348
371 374 388 391 394 399 401 409 411 
429 436-437 456 462 477 488 496 498 
510 512 515 539 542 545 549 559 563 
573 579 587 589 601-605 612 620-621 
624 640 643 647 681 715 723 728 732 
735-736 740 745 748 753 766 785-786 
792-793 797-801 812 822 829-831 853- 
856 859 876-877 884 893-894 908-909 
918 925 933 950 969 978 


cultured 
preadipocytes * 


Stratagene


ADP001


4 28-29 69 93 114 121 132-133 135 151- 
152 159 167 172 178 181 184 190 194- 
195 203-204 209 217 219 240 248 260-
262 267 273-274 277 282 297 301 304
312 314 326-327 361-362 371 374 388
394 401 403 405 411 420 437 453 466- 
467 470 474 478 496 507-509 517 530 
532-533 584 588 593 602-603 608 610 
617-621 630-631 633 639 642-643 661 
693 729 746 761 765 769 834 842 848
887 907 923 947-950 957 967 969 


adrenal gland 


Clontech 


ADR002 


1 3 12-13 21 23-24 27-29 67 74 78 103- 
105 108-109 113 115 118 120-121 128- 
133 149 156 160 172 177 182 214 217 
223 232-233 247 254 269-270 273-274 
277 283 285 288 298-299 308 317 319 
328 338 340 342 361-362 364 372 376- 
377 382 384 401-402 405-406 416 420 
431 437 444 446 448 457 462 484 500 
507 517 524 532-533 539 545 554 561-
562 564 588 597 602-603 606-607 635
642 646 649 658 664 674 693 703 730 
740 745 752 759 765 767 775 779 799 
809 817-818 839 845 856 859 863 887 
890-891 896 948 953 958 961-963 973 


adult heart 


GIBCO 


AHR001


1 3-4 8 10 14 20-21 25 28-29 33-34 37-38
41 48 54-57 65 69-72 75 78 80 82-83 97 
99-100 108 112-115 117-121 123-124 
128-133 141 144-146 149 152 159 162- 
163 168 172 176 179 181 184 186-187 
190-191 201 203 208-209 212 216-218
221 223 227 229 233 244 247 249 253- 
255 258 263-264 267 269-270 274 278 
280-282 285 289 291 295 297-299 301 
303-304 308 313 317 321-322 326 328 
334 344 348 352 358 361-363 370-371 
380 382-383 388 394-396 398 401 403 
405-406 410-416 423 425-427 430-431
436 452-453 464-465 470-474 481-484 
487-488 490 492-494 496 499-500 505- 
506 508-509 514 523 529-530 533 547- 
548 553 558 563-565 577-578 586-588 
590 593 597 601-603 606-608 610-613 
617-619 621-622 626-628 637-638 642-
644 652 658 661 672 682-683 688 691
693 697 699 708 711 713 715 732 737 
745 747-748 750-753 759 761 765 768- 
770 775 790 802-803 814-815 818-819 
830 837 839-840 842 845 846 859 861-
862 867 876-877 887 891-892 896 900-
901 903 905-906 908-909 919-920 922 
925 928 936 939-940 946-947 950 953 
959 967 970-971 973 977 


adult kidney 


GIBCO 


AKD001


1 3 8 12-14 17 19-25 28-29 33-34 37-39
41 46-48 50 52 55-60 62 65-67 69 71-72 
75 77-78 82 84 89-90 93 97 108-110 114-
116 118-121 123-125 128 130-133 135
138 144 146 149 156 159-161 163-164 
167-172 176 179 184 186-187 189-190 
194 196 200-202 204 209 211-212 216-
217 219 221 223-224 229 232-235 244
247 250 253 255-256 258 263-264 268-
272 274 277-281 283 286 288-290 292 
294-295 297 301 303-309 311-314 316 
319-323 325 328-338 342 348-349 352 
354-355 358 361-363 365 370-371 373 
376-378 380 382-383 388 395-399 401- 
403 405-406 409-413 416 418-420 425- 
428 430-431 440 442 452-454 462 464- 
465 470 472-474 477 479 481 483-485 
487-489 492-495 498-500 504 506 510 
517 522 525 529-530 532-533 539 542- 
543 547 551-552 558 560-564 569-570 
573-574 577-578 580-583 585-590 594- 
596 601-608 610-613 617-621 624 626- 
628 630-631 634-636 639 642-643 648 
652 656 658 664-665 676-677 679 681 
688-691 693 697 699 708 711 715 717 
720-722 724 729-732 738-741 747-748 
751-753 761 765 770-778 780 784 789 
791 793 797 804 813 817 823-824 834 
837 839 842-843 845 848 859 861-862 
864 867 870 876-877 887 889 892-894 
896-897 900-901 903 907 913-915 918 
921 923 925 929-930 932 939 942 946- 
947 949-950 953 958-959 961-963 967 
969 972 977 


adult kidney 


Invitrogen 


AKT002 


1 3 16 21 30 32 35 38-41 46-47 56 77 92
109 123-124 130-131 146 149 161 167- 
168 172 176 190 209 212 234-235 258 
279 292 301 303 308 314 333 355 363
372 380 383 396 399 402 418-419 426- 
427 431 448 454 461 471-474 488-489 
495 498 504 506 508-509 520-521 530 
537 539-541 545 547 563 582-583 592 
613 617-618 621 623-624 633 655 688 
690 693 699 704 713 732 745 752-753 
761 766-768 770 784 789 797 837 842 
848-849 866-867 877 887 893-894 903 
914-915 925 929-930 937 944-945 947- 
949 955 961 967 984 


adult lung


GIBCO


ALG001


1 3 14 18 28-29 38 54-56 59 92 110 114- 
115 130-131 146 149 156 159 164 167 
176 184 209 217 234-236 240 255-256 
258 263-264 269 271 276 280-281 297 
305 308 312 314 322 325 332 336 344 
353 361-362 388 401 410 420-421 426-
427 431 465 469 474 484 498 500 506
508-509 517 530 532 573 592 5^0 oli 
619-620 623 626-628 638 658 679 681 
684 689 717 731 741 771 791 799 817 
834 845 861-862 864 875-876 901 921 
925 928 932 940 947 949 959 962-963 
967


lymph node 


Clontech


ALN001


3 10 110 146 160 168 196 209 221 269
278 301 336 348 394 405 411 420 422 
459 464 474 485 503 506-507 532 563 
582 619 623 630-631 642 669 684 697 
713 715 727 747 767 769 789 825 839 
842 849 887 896 913 921 925 


young liver 


GIBCO 


ALV001


3 14 16 37-38 41 51 56 60 97 104-105 
108 110 117 119 128 130-131 134 139 
149 152 169-172 176 184 189-190 200 
209 212 216 218 228 232 255 258 263 
270-271 275 285-286 292 295 298-299 
301 304 314 341 358 365 368 376 400 
410-412 431 474 481-482 485 496 500 
504-505 517 520-522 524 530 532-533 
547 551 563 581 583 610-611 621 624 
635 643 691 708 711 715 720 752 755 
761 768 796-797 811 818 830 845-847 
852 864-865 867-869 896 899 910-911
949 958 965 969 972-973 


adult liver 


Invitrogen 


ALV002 


3 37 42 56 60 71 82 104-105 114-115 
117-118 125 130-131 134-135 164 169-
172 176 179 200 203-204 212 217 223
226 232 237 244 263 274-275 292 301 
310-312 314 317 349 354 364 368 372 
376 398-399 402 426-427 439 442 451 
458 465 474 482 485 490 506 515 525 
527 545 547 552 568 571 573-575 582 
587 594-595 604-605 608 610 621 630- 
631 634-635 637 657 664 690 693 699 
723 726 745 751 763 767 784 793 811 
822 845 848 852 856 861-862 864 892 
899 908-909 925 950 958 967 983 


adult liver 


Clontech 


ALV003 


60 134 169-171 275 


adult ovary 


Invitrogen 


AOV001


1 3 9-10 12-14 16 18 20 22-25 28-29 33- 
35 37 39 41-42 46 48-50 55-57 59 63-67 
69 71-72 75 77-80 82 88-89 92 101 103- 
106 108-110 113 115 119-121 123-126 
128-133 135 138 142-146 149 151-152 
159-161 167-168 172 174 176-177 179 
181 184-190 194 198 200 203 208-209 
211-212 214 217 219 221 224 226 232-
235 240-242 246-247 249 251 254-255
258-259 264 269-271 274 276-277 279- 
283 285 288 290 293-294 297 301-304 
306-308 311 314 319-322 325-326 328- 
329 331-332 335-338 341-342 344 348 
354-358 361-363 365 368 370-372 374 
376 379-380 382-383 388 394-396 398- 
399 401-402 405-406 409-412 416 418- 
421 423 425-433 438 442-443 449-452 
454 462 464 466-467 469-471 474 479 
482-484 488 490 492-496 498 500-504
506-509 511 515-518 520-524 529-530 
532-533 537 539-542 545 551 555 558 
560-565 569 571 573 577-578 581-583 
585-590 592-593 596-597 600-605 608 
610-611 613-614 617-628 633-637 639 
642-643 646-648 650 652 654 656 658 
664 668-670 672 674 679 681 684 688 
691 693 697-699 701-702 713 717 721- 
722 724 729-732 738-744 747-750 752- 
753 755 759 761 765 767-774 779-780 
783-784 789 793 795-797 801 813-818 
823-824 828 830-832 834 837 839 841- 
842 845 848-851 856 859 862 864 866- 
867 870-871 874-878 881-883 887-889 
891 893-894 896-897 901 903 906-911 
913 919-922 925 928 930 936 939-940 
943-944 946-947 949-950 952-953 955 
957-958 962-963 965 967 969 971 973 
977 981-982 


adult placenta


Invitrogen 


APL001


41 56 67 253 301 304 334 380 383 451 
474 479 500 577-578 643 648 729 767 

856 859 866 873 962-963 


placenta 


Invitrogen 


APL002 


3 21 31 38 63-64 78 135 143 168 186-187 
212 232 244 263 280-281 334 336 344 
348 371 374 394 399 461 490 582 588 
602-607 610 620 699 745 769 793 817 
822 859 897-898 923 928 931 943 949 
969 973 


adult spleen 


GIBCO 


ASP001


1 3 21-22 46 52 54-55 57-58 61-62 72 74 
78 82 88 118 121 130-131 137 152 159 
168 172 189 203 209 217 223 234-235 
252 255 263 269 271 274 282 288 290 
301 314 322 335 350 363 394 403 405- 
406 410-412 415 431 459 464 472-474 
482 488 500 506 510 514 517 532 537 
542 561-563 589 593 602-603 610 613 
619 621 636 642-643 655 658 662 674 
676 679 681-682 684 689 691-692 697 
699 715 720 723 729 769-770
782 793 818 830 834 845 856 859 862 
877 887 893-894 896 903 906-907 914- 
915 918 925 928 930 940 946 965 967 
977 982 


testis 


GIBCO


ATS001


6 22 28-29 33-34 41 48 52 62 65 72 97 
106 109 118 132-133 145-146 168 172 
176 183 185 189-191 195 209 211-212
214 221 223 230 254-255 258 263 269 
283 297 312 314 321 342 352 361-362 
365 380 383 388 395 401 405-406 412 
430-431 441 469-470 474 479 495-496 
500 506 520-521 533 543 545 548 560 
563 574 582 589-590 593 608 616-618
620 623-624 638 642-643 697 699 708 
711 745 747-748 765 767-768 779 784
789 812-813 834 837 839 848 859 862 
868-869 875-877 887 889 893-894 896 
928 944 947 953-955 972 981 


Genomic DNA 

from BAC

63118 


Research 
Genetics 
(CITB BAC 
Library) 


BAC001


515 


Genomic DNA 

from BAC

39316 


Research 
Genetics 
(CITB BAC 
Library) 


BAC002 


640 


Genomic DNA 

from BAC

39316 


Research 
Genetics 
(CITB BAC 
Library) 


BAC003 


640 


adult bladder 


Invitrogen 


BLD001


50 55 66 71 111 143-144 148 160 201 209 
223 255-256 280-281 286 305 315 319 
340 394 431 442 488 497 505 518 552 
588-589 621 636 664 676 715 738-739 
769 790 824 837 845 877 887 936 940 
948 962-963 967 


bone marrow


Clontech 


BMD001


3 10-13 16 18 20-21 25 28-29 31-34 41 45 
48 52 54-55 57 59 61 65 67 72-73 75 78 
80 82 84 99 103 108 110 114-115 118-
120 123-124 128 130-133 143-144 148
152 159-161 163 168 172 174 176 178 
190 192 198 203 209 211 217-218 221
223-224 227 233-236 244 247 249 252 
254 258 260-262 267 269 272 278 280- 
281 284-285 288 290 294-297 301 304 
308 314 317-318 320-321 325 328-330 
333-335 349 351-354 358 363 365 367 
377 382 388 394-397 400 405 408 410-
412 418-421 425-428 431 433 435 442
449-450 453 455 459 464 468-470 474 
478-479 481 484 490 496 504 506 508- 
509 511 519-521 530 532 539 553 558- 
559 561-563 580 582 586 592 599 608 
610 613-614 617-619 623 625-628 635 
638 641-643 658 664 672 682 699 711
713 717 731 734 740 742-743 745 761 
768-771 774 776-778 784 787 789 813 
817-818 822 834 839-840 842 848 862 
866 870 876 885-887 891 896-898 900 
903 906 913 919 921-922 927-928 939 
944 947 950 953 959 961-963 967-968 
970 973 977 


bone marrow 


Clontech 


BMD002 


3 9-10 15-19 30 33-34 39 45 54 57 63-64 
71 82 102 116 119 130-133 148 152 156 
159-160 168 176 182 224 254-255 271-
272 282 285 290 297-299 301 305 323 
333 340 344 351-355 358 361-362 364 
367 370 372 387 394-395 399 403 405 
409 411 449-450 459 461 468 474 488-
489 524 530 532 580-582 592 602-603
611 617-618 621-622 630-632 642 661 
663 694 717 730 734 740 745 752 755 
761 767 769-771 775-778 784 787 811 
813 818 832 840 842 849 859 878 887 
893-894 896-898 903 906 908-909 923 
928 944 946-949 953 958-963 965 982 


bone marrow 


Clontech 


BMD004 


54 


bone marrow 


Clontech 


BMD007 


766 887 928 


adult colon 


Invitrogen 


CLN001


22 37 67 97 117 121 148-149 168 172 190 
200 204-205 232 244 263 268 292 301- 
302 363 377 384 452 455 459 470 530 
582 602-603 619 687 723 728 751 761 
831 861 887 914-916 934 955 969 984 


Mixture of 16
tissues - 
mRNAs* 


Various 
Vendors* 


CTL016 


358 740 760


Mixture of 16 

tissues - 
mRNAs*


Various 
Vendors* 


CTL021


468 527 928


adult cervix 


BioChain 


CVX001


1 3 10 14 22 28-30 37 41 47-48 51-52 54- 
57 71 82 89-90 92 106 108 110-111 117- 
118 121 129-131 135 141 143-146 160- 
161 164 168 172 177 189-190 193 195 
200 204 209 211-212 217 226 229-230
232 234-235 240-242 246 254 260-263 
268-270 274 277 282 285 292 295 297 
305-308 314-316 319 328 343-344 348 
354 358 363 368 380 382-384 389 394 
396 399 401 405-407 410 416 418-421 
428 430-431 437 442 453-454 459 464 
469 471-473 476 480 484 492-495 500 
504 506-509 516-517 526 530 532 545 
550-551 563-565 569 577-578 585-586 
590 608 611 613 619 621 623 628 630- 
631 634-637 641 643 648 656-658 664- 
665 674 679 682 689-690 693 700 703 
708 713 721-722 724 728 732 742-743 
747 750 752 755 757 761 763 767-769 



* The 16 tissue mRNAs and their vendor sources are as follows: 1) normal adult brain mRNA (Invitrogen), 2)
normal adult kidney mRNA (Invitrogen), 3) normal adult liver mRNA (Invitrogen), 4) normal fetal brain mRNA
(Invitrogen), 5) normal fetal kidney mRNA (Invitrogen), 6) normal fetal liver mRNA (Invitrogen), 7) normal fetal
skin mRNA (Invitrogen), 8) human adrenal gland mRNA (Clontech), 9) human bone marrow mRNA (Clontech),
10) human leukemia lymphoblastic mRNA (Clontech), 11) human thymus mRNA (Clontech), 12) human lymph
node mRNA (Clontech), 13) human spinal cord mRNA (Clontech), 14) human thyroid mRNA (Clontech), 15)
human esophagus mRNA (BioChain), 16) human conceptional umbilical cord mRNA (BioChain).
779-780 784 788 810-811 813-815 822
834 836-837 839 848 861 866-867 871 
874 877 887 891-894 897-898 901 913 
916 919 921-922 925 946-947 953 958-
959 967 969 973


diaphragm


BioChain 


DIA002 


3 39 184 203 431 563 848 967 


endothelial 
cells 


Stratagene


EDT001


3 6 8-10 14 19-24 28-29 33-34 37 39 41 
46 48 52 55-58 62-65 67 69 71-72 75 78 
80 82-83 87 101-102 108-109 114-115 
117 123-124 128 130-133 135 138 143 
145-146 149 156 159-160 167-168 172 
174 176-177 179 181 184-187 189-190 
194-195 200 203 208-209 212 216-217 
219 223-224 226-227 229 234-235 244 
248-249 254-256 258 263-264 267 269 
271 274 276-282 285 290-291 294 297 
301-304 308 311 313-314 316-317 320-
321 323 325-326 328-329 331-332 334-
337 339-341 344 348-349 352 354-355 
358 361-363 365 367 371-372 375 379- 
380 383 389 394-395 398-403 405-406 
409-412 425-428 437 442-443 448 454 
464 466-467 474 479 481 490 492-498 
500 503 506-509 511 517 520-521 523- 
524 530 532 537 540-542 558 561-563 
565 569-570 573 581-583 586 588-589 
596 602-608 610-611 613 617-622 625 
628 630-631 633-637 642-643 646 648 
650 652 659 661-662 682 688 690-693 
696 698-699 708 712 715 717 720-722 
724 727 729 740 745 748-750 752 761 
765 767-770 772-773 779 784 789 792- 
794 796 802-803 811 817-818 821 824 
827-828 830 834-835 837 842 845 848 
859 861-862 864 866-867 870 876 885 
887 891 893-894 897-898 900 903 906- 
907 913 916 921 925 939 947 950 953 
955 957-958 962-963 967 973 978 984 


Genomic 
clones from the 
short arm of 
chromosome 8 


Genomic 
DNA from
Genetic 
Research 


EPM001


324 515 640 


esophagus 


BioChain 


ESO002 


97 103 128 371 474 


fetal brain 


Clontech 


FBR001


67 129 156 159 232 267 433 446 503 845 
952 


fetal brain 


Clontech 


FBR004 


28-29 185 213 277 350 384 432 485 501 
549 651 747 754 761 780 787 848 870 

887 906 958 


fetal brain 


Clontech 


FBR006 


10-11 14 21 30 32 47 49 56 65 69 72 77-
78 82 84 97 101 115 118 121 125 128
130-131 138 142 148 152 159-160 179 
185 188 194 197 203 210 212 214 219 
222 227-229 243-246 249 252 256 264
270 273 282 285 290-291 293 301-303 
305-306 312 321-322 325 327 339-340 
344 346 350 354-357 363 367-371 374 
388 391 394-395 399 402 405-406 410 
414 420 426-427 436-437 442 444 454 
456-457 460 462 464 470 480 485 492- 
494 507 510 516 524 528 530-532 539- 
542 549 553-554 561-562 580-582 588-
589 602-608 611 615 617-619 621-622
624 632 636 641-642 646-647 651-653 
661-662 666-669 672 677 691 715-716 
730 735 740 752 754 761 767-770 772- 
nn^ lart tsi 700 801 ftOS XIX 892-823 
Sl'i'i sun iUS X^A 859 864 867 876 880 
885 887 890 893-894 896 913 918 926 
942 946-947 951 957-959 962-963 970- 
971 


fetal brain 


Clontech 


FBRs03 


130-131 312 517 637 691 738-739 


fetal brain 


Invitrogen 


FBT002 


3 22 28-31 47 57 63-64 72 75 77-78 86 
94-95 97-98 126-127 135 140 143 156 
159-160 167-168 177 185 190 196 201 
203-204 214 217 230 254-255 258 267 
273-274 277 279 282-283 292 301-302 
305 312 314 323 329 346 348 367 374 
382 394 399 401 403 412 415 420 432 
437 474 482 485 495 507 513 517 527 
529-530 539-542 548 552 579 587-588 
600 604-605 612 617-618 621-622 624 

/TOyf /ZAn fiA'X f^AH AAfl ^^Ci f%7Q <\M f\Q% 

/;qq 710 71^ 7A0-7A'^ 74S 748-749 753 
768-769 793 797 829-831 834 845 848 
856 859 893-894 908-909 913 916 931 
933 940 950 967 969 


fetal heart 


Invitrogen 


FHR001


19 57 130-131 394 431 642 769 844


fetal kidney 


Clontech 


FKD001


3 31 33-34 38 48 54 72 160 208-209 211
223 264 269 277 283 290 313 325 341 
"idH 'i^H '^Q^ 41 8-420 474 484 506 508- 
509 517 520-521 532 547 553 558 567 
569 587 596 608 610 613 619 622 626- 
627 642 679 734 745 818 843 887 896 
903 916 969 971 


fetal kidney 


Clontech 


FKD002 


19 474 726 903


fetal kidney 


Invitrogen 


risX'UU/ 


118 186-187 230 244 271 432 887 969


fetal lung


Clontech 


FLG001


69 132-133 156 168 208-209 217 267 269
274-275 286 354 394 396 406 462 483- 
484 608 619 751 769 771 834 914-915 
925 


fetal lung 


Invitrogen 


FLG003 


3 8 28-29 32 39 50 66 82 88 92 168 186-
187 200 204 212 226 229 246 274 309
327 332 368 374 382 394 398 426-427 
431-432 442 485 536 555-557 587 604- 
605 621 624 636 642-643 661 677-678
724 753 769 848 859 864 877-878 896 
902 904 914-915 958 


fetal lung 


Clontech 


FLG004 


130-131 394 664 769 942 


fetal liver-
spleen


Columbia 
University 


FLS001


3 8-10 12-13 16-17 19-25 27-29 33-35 37-
38 41 45-46 48 52 55-58 60-67 69 71-74
77-78 80 82 84 87-90 104-106 108-109 
112-121 123-125 128-134 138 141 143- 
146 149 151 156 159 163-164 167-172 
174 176-179 181 184 186-188 190 194 
200-201 203 208-209 211-212 216-217
219 224-227 229-230 232 234-235 237 
241 243-244 246-248 254-255 258 260- 
263 267 269-270 273-282 284-285 288- 
290 292-295 297-299 301-306 308 311- 
318 320-323 326 328 332 335 341-344 
348 352 354-359 361-365 367-368 371- 
374 376-380 382-383 388-389 394-396
398-399 401-411 413-414 416 418-421
425 428-430 432-433 437 439 442-444 
449-450 452 456-457 461-470 472-474 
478-479 481-482 484-485 487 490-494 
497-499 504-507 511 514-515 517-521 
523-524 526 529 532 537 540-541 547 
555 558-559 563 575 577-578 580-596 
598-599 601-603 606-608 610-613 617- 
624 626-628 630-631 634-636 639 642- 
643 647-648 654-656 663-665 672 674- 
675 679 681 684 686 688 691 693-699 
711 713 715 717 719-726 729 732-733
738-740 745 748-749 751-753 757 759 
761 767-770 776-778 780 784 787 792- 
794 799 804 809 811 813 817-819 822- 
825 830-831 834 837 840 842 845-848 
852 856 859 861-862 865 867-869 871 
874-878 887-888 891 893-894 896-900 
903 905-911 913 916 918 923 928 930- 
931 936 939 942 944 946-950 952 958- 
959 961-963 965 967 969-970 972-973
976-977 981-983 


fetal liver- 
spleen 


Columbia 
University


FLS002 


3 8-13 15-17 19-20 22 25 28-29 33-35 37 
41 45-46 52 54-56 60-61 63-64 66-70 73- 
74 78 80 82 92 99 104-106 108-109 112 
115-116 118 120-121 123-125 128 132- 
135 139 141 143-144 146 149 152 156 
159-161 167 169-172 174 176-177 179 
181 185 188 190 194 196-197 200 204 
212 214 2I0-2I0 JLZo-Zi\J LiL- 
235 237 246-247 252 254-255 258-263 
267 270-277 284-286 288 292 294-295 
297-299 301 303-305 308 310 314 318 
320 323 328 330-332 335-337 340 342- 
344 352 354-355 358 361-365 367-368
371 373-374 376-377 382 388 394-396 
398-399 401 405-406 409-411 413 418- 
421 429 431 439-440 442^ 451-452 
457 462-463 466-468 470 474 477-479 
481 483-484 487-488 491 495 499 504 
508-509 516 519-521 524 526-528 530 
532 537 540-541 543 545-547 550-551 
553 555 560 564 568 574-575 577-578 
580-592 596-597 600 602-603 608 610- 
611 613-614 617-618 621-622 628 630- 
631 634 637 639 642 644 647 654 658- 
659 665-667 669-675 679 681 684-685 
688-690 693 695 697 708 711 713 715 
717-719 723-727 729 731-734 738-739 
741 745-746 749-750 753 759 761 766- 
767 769-770 776-779 782 784 791-792 
794 805 808 817-818 822 824-825 830 
834 837 842 845-849 852 856 859 864- 
865 867 874-878 888 891-892 896-900
903 905-906 908-909 913 916 918 921 
923 925 932 936 939-940 942 944 946- 
947 949-950 953 955-956 958-959 961- 
963 965 968-970 973 977-978 981 


fetal liver- 
spleen 


Columbia 
University 


FLS003 


19 60 78 224 273 275 370 373-374 401 
602-603 639 643 730 732 738-739 748 
752 770 782 928 930 947 949 


fetal liver 


Invitrogen 


FLV001


37 55 60 69 72-73 97 104-105 108 113- 
114 116-118 121 135 143 152 167-168 
186-187 195 200-201 209 217 223 240
244 253 255 275 284 301 311 314 317 
336 342 348-349 358 371 374 382 394 
402 411-412 418-419 428 430 442 453 
517 568-569 580 582 584 587 589 601- 
603 606-608 617-618 624 634 639 642- 
644 646 664-665 669 679 715 717 720 
726 745 748 751 769-770 782 791 794 
797 824 830-831 845-847 852 859 870 
899 913-916 925 928 948 956 958 969 
976 982 


fetal liver


Clontech 


FLV002 


72 418-419 632 


fetal liver 


Clontech 


FLV004 


3 160 169-171 355 367 374 376 547 617- 
618 621 646 717 741 771 836 878 976 


fetal muscle 


Invitrogen 


FMS001


15 27 32 37 67 72 83 99 112 121 138 167
174 177 186-187 190 203-204 211 215
230 252 259 312 374 403 406 409 457 
461 485 505 517 528 530 540-541 544 
549 554 558 579-580 583 602-603 608 
639 642-643 654 664 699 715 730 737 
751 772-773 788 802-803 810 848 856 
859 864 868-869 887 893-894 905-906 
910-911 923 948 967


fetal muscle


Invitrogen 


FMS002 


15 99 130-131 223 361-362 431 474 505 
581 639 643 666-667 784 790 808 810- 
811 874 880 887 903 946 950 958 962- 
963 973 


fetal skin 


Invitrogen


FSK001


3 6 20-22 32-34 41-45 47 49-52 55 63-64
66 69 77 80 88 91 98 101 111-112 115 
126 130-131 135 142 144 146 160 163 
167 176 188-190 196 201 204 208 213 
215 217-218 229 232 244 246 248 255 
263 265-269 274 279-281 283 285 288 
292 294 297 301 303 308 314 321 341- 
342 344 348 354-355 358 361-362 366 
369 371-372 374 381-382 384 386 394 
401 403 405 413 415 428 431 437 440 
460 466-467 472-473 477 481 483 495 
499 504 517 522 532 536-537 539-541 
545 556-558 569 574 576-578 580 584- 
585 587-589 592-593 602-603 606-608 
612 617-618 621 624 634 637 639 642- 
643 647 664 673-674 676 680-681 689 
699 705-707 709-715 724 728-730 738- 
740 745 748 752 765 768-769 772-773 
793 797 817 823 830 834 842 848 859 
861 864 870 874 883 887-888 893-894 
901 904 908-909 913-916 923 925 947 
950 958 962-964 967 975 


fetal skin 


Invitrogen 


FSK002 


3 130-131 146 194 306 354 367 400 405
474 489 520-521 547 558 561-562 585 
596 730 740 748 755 7^7 771 810 840 
893-894 946 959 


fetal spleen 


BioChain 


FSP001


276 563 842 


umbilical cord 


BioChain 


FUC001


3 20 33-34 39 48 50 52 55-57 65 67 69 72 
77 79 82 92 109 112-113 121 132-133 
138-143 156 167-168 172 174 179 184- 
185 190 194-196 200 202-203 208-209 
229-230 244 269-271 278 284-285 290 
297-299 303 305 308 320 331-332 336 
338 342-343 363 367 372 374 379-380 
383-384 392-394 397 399 402 405-406 
410 425-427 429-430 449-450 474 476 
484 497 499 501 504-505 510 515 517 
532-533 539 549 551 558 563 569 574 
577-578 581 586-587 597 602-603 608 
610 617-619 621 626-627 634-637 639 
642-643 658 663-664 674 690-691 693- 
694 699 713 715-717 720 724 726 729 
738-739 746-747 749 759 761 765 768- 
769 nA-ll5 793 797 oU/ olo aJU oil 
848-849 856 862 868-869 874 885 887 
892-894 903 906-907 916-917 919-920 
928 936 939 944 946-947 962-963 967 
969 


fetal brain


GIBCO 


HFB001


3 9-10 12-14 16 21 25 28-30 32-34 37-39 
41 47-48 52-53 56 65 67 69 71-72 75 80 
84 92 97 103 106 110 114 117-119 123- 
124 127 129 132-133 135 138 141-142 
144-146 148-149 152 156 159-160 168 
172 174 176 179 181 184-185 190 198 
208-209 212 214 219 221 223-224 229- 
230 233-236 240 244 247 251 253-255 
258-259 270 273 276-277 285 297 304- 
305 308 312 314 322-323 325 328 332- 
333 335-337 339-340 342-344 346 352 
354 358 363 365 370-372 374 382 394- 
396 398 401 403 405-406 409-412 414 
416 425-427 431-432 437 442 445 453 
456 462 466-467 469-470 472-474 479 
483 488 490 492-497 500-501 504 506- 
510 520-521 524 530 537 539 545 549 
552 558 560-562 564 569 579 582-583 
586-587 596 602-608 610-612 614 617- 
624 626-628 630-631 633 635 638 641 
643 647-648 656 658 661 676 679 688- 
689 693 696-697 711-712 715 724 726 
731 735 745 747-749 752 754 761 765 
767-770 774 779-781 784-786 789 799- 
800 802-803 813 818-819 823-824 831
834-835 837 839 845 848 859 864 866- 
867 871 874-875 881 887 891 893-894 
896-897 900 906-907 910-911 918 921- 
922 925 927-928 930 943-944 946-947
950 953 962-963 965 969 972-973 977 


macrophage 


Invitrogen 


HMP001


86 168 186-187 297 537 608 681 761 845 
877 


infant brain


Columbia 
University 


IB2002 


2-3 9-10 12-14 16 21 25 27-30 32 37-38 
46-47 49 55-56 58 65 69 71-72 78-79 82 
84-86 91-92 98-99 106 109-110 113-115 
118 127-128 130-133 135 138 142 144 
151 156 168 173-176 180-181 185-188 
192 194 196-201 203 208 210-212 214 
217-218 224 229-231 233 236 238 240- 
241 244 246 251-256 259 263 270-271 
277-279 284-285 287 293-294 296 301- 
302 308 312-314 317 322-323 327 330 
333 339 342 345-346 351 354 358 361- 
362 365-366 368 370-371 373-374 382 
388 394-396 402 405-406 411-412 415- 
416 420 424-425 428 431 436-437 440- 
441 444-445 453 456 460 465 474 479 
482-483 488 495-496 498 501 503-504 
506-510 515-517 520-521 524-525 529 
531-532 534-535 537 539-542 544-545 
549 561-562 569 574 577-578 580-583 
586-587 589 592 596 600-608 610 612- 
613 616-618 620 622 624 629-632 634-



635 637 641 643-644 650-651 653 661 
663-664 676-677 689 693 695-698 708 
711 720-722 724 730 732 735 740 745-
748 754 765-766 768-769 779-781 785-
786 789 791 796 798 800-803 807 811- 
813 818-819 822-824 830-831 834-835 
837 839 842-843 845 854 856 858 864 
867-869 875-877 879 881 887 892-894 
896 903 907-911 913 916 919-920 925 
930-932 936 939 943 946-947 953 958 
970-973 977-978 982 984 



infant brain



Columbia
University



IB2003 



3 12-13 21 27-29 32 39 49 69 72 82 91 
113 116 126 128 132-133 142 144 156 
176-177 184-185 188 194 208 212 223- 
224 228 230 244 255 259 267 270 273 
276 293-294 312 320 326-327 337 342 
346 354-355 358 361-363 382 388 390 
394 396 399 402 420 425 431 442 462 
474 482 484 488 495-496 510 520-522 
524 529 540-541 549 563 582 586 588- 
589 596 600-603 606-607 612 617-618 
620-621 632 647 650 679 720-722 724 
735-736 746 751 754 769 785-786 793 
800 807 811-813 818-819 822 824 831 
834 838-840 843 856 864 892 896 907 
919-920 925 930-931 936 947 950 957 
973 982 



infant brain



Columbia 
University 



IBM002 



16 47 82 84 201 263 302 376 394 421 440 
488 537 592 606-607 635 740 769 887 
892 906 921 926 971



infant brain



Columbia 
University



IBS001



84 86 180 185 198 201 203 230 279 312 
326 346 354 366 388 488 542 581 588 
620 647 664 732 740 785-786 801 807 
822 827 910-911 925 931



lung, fibroblast 



Stratagene



LFB001



3 11 25 49 65 75 114 141 156 160 172
190 198 209 217 224 229 234-235 267 
269 274 277 282 284 303 308 312 320 
334 336 352 372 396 398 412 414 437 
453 464 470 481 492-494 508-509 532 
539 581 584 617-619 621 628 633 643 
688 691 745 752 761 768 794 822 837 
848 876 887 953 967 973



lung tumor 



Invitrogen 



LGT002 



1 3 9-10 12-13 20 31 38 41 46 48 51-52 
56 58 63-64 72 74-75 78 82 88 101 106- 
107 110 114-115 117-118 120-121 123- 
124 128-133 135 143-146 149 151 156 
159-161 163-164 167-168 172 176 178- 
179 184-185 189-191 194-196 200 203 
209 212 216-217 226 228-229 232 234- 
236 241 246 248 256 258-259 263-264 
269-271 274 282-283 285-286 290 292


294 297 301 308-309 311 314 317 321 
326 328-329 331 333-334 341 348 352 
354-355 363 365 371 380 382-383 388 
394-395 398-402 405-406 410-411 413
416 418-419 426-427 439 442 452-453 
458-459 461-462 464-465 470-471 474 
478 483-484 490 495-496 499 510 522 
524 528 536-537 540-541 543 548 556- 
558 560-565 571-573 580 582 587-588 
592 597 602-605 608 610 612-613 617- 
622 625-629 633-634 636 642-644 648 
661 664 669 679 688-689 691 693 699- 
700 708 717 723-724 730 733-734 738- 
740 745 747 749 752-753 761 767-768 
770 779 782 784-786 789 793-794 797 
817-818 820 823-824 834 837 842 845 

OAO OCC OCT SCO fi/^O fi<%A ft^ifi fn(\ KTS- 

848 od5 o5/ ojy oOZ oOt oOO o/u o/j- 
877 887 892 896 900-901 907-909 914- 
915 919-920 923-925 939 943 947 949 
953 958 962-963 965 968 970 972-973 
977 


lymphocytes 


ATCC 


LPC001


3 9-11 32 47 50 56 71 75 88 97 99 102
121 125 128-129 135 138 141 149 163 
167-168 212-213 217 233 255 290 294 
301 305 311 314 342 372 377 388 398- 
399 410 437 442 453 470 474 481 495 
500 506 510 529 532 537 542 55? 571 

cnn £.t\A £.t\c ^O/l #^Ofi ^1*7 

666^(n 676 679 697 708 713 728 730 
734 749 765 768 796 807 818 822 834 
839 848 859 875 885 887 896 903 906 
914-915 928 947 973 981-982 


leukocyte 


GIBCO 


LUC001


1 3 9-11 18-19 21 23-25 27 31-34 39 41-
42 46-48 52 54-58 62-69 71-72 74-75 78- 
80 82 89-90 93 99 110 115-121 123-124 
128-133 135 138 141 143-146 149 1512 
156 159-161 163 167-168 176 179 181 
186-187 189-190 194 198 200 203-204 
209 211-212 218-219 226 232-236 240
244 247 251 253-255 258-259 263-264 
269 271 274 278-279 282-283 285 288- 
290 294-295 297 301-306 311 313-314 
317 320-321 325 328 330-331 335 337 
342 344 348 350-351 353-354 358-359 
361-365 368 371-372 375 388-389 394- 
395 397-401 403 405 407 409-412 421 
425-427 432 437 442 448-450 452 457 
460-461 468-471 474 476 479-482 484 
492-494 496-498 500 506-510 516-517 
520-521 524 529-530 532 537 540-544 
551 553-554 558 560-565 569 577-578 
580-583 586-587 589 592 596-597 602- 
603 606-608 610-624 626-628 630-631
634-635 641-643 654 657-658 661 663- 
665 669 672 677 679 684-689 691 696- 
697 699 708 711 713 715 717 721-724 
728 730 738-740 747-749 755 761 765 
767-769 771 774-779 782 784 789 791- 
792 794-795 797 807-808 811-815 817- 
818 822 824 828 830 832 834 839-840 
842 845 848 856 859 862 864 867 871 

906-911 913-916 921 923 925 927-928 
930 932 935-936 939 943-944 947 949- 
950 953 958-959 961-963 965 967 972- 
973 982 


leukocyte 


Clontech 


LUC003 


1 41 82 106 119 123-124 160 177 184 201 

212 221 228 271 2/y 2od Zyj iZi iz:> 
372 394 411-412 443 468-470 530 532 
537 551 569 580-581 613 619 623 626- 
627 642 655 697 761 767 769 775 789 
809 867 887 923 928 950 


melanoma 
from cell line 
ATCC#CRL 
1424 


Clontech 


MEL004 


3 25 55-56 67 71 78 109 121 129 146 167 
172-173 176 200 209 212 258-259 263 
278 297 301 306 312 335 338 340 352 
361-362 367 388 395 402 410 418-419 
429 437 454 464-465 481 496 500 503 
507 524 532 539 5o0-5o2 DKl-poz 36/ 
589 599 612-613 617-621 623 643 657 
663-664 672 715 724 748 752 761 767- 
768 770 785-786 789 835 848 877 887 
896 916 919-920 947 967 978-980 


mammary 
gland 


Invitrogen 


MMG001


1 14 19 21 28-29 31-37 47 49-51 55 57 
63-67 69 71-72 75-78 92 108-109 111 116 
121 123-124 126 128 130-133 135 143- 
144 148-150 156 159 164 168 172 177- 
179 184 186-187 190 194 200-204 209 
212 217 226 230 232-236 241 244 246- 
247 252 255 258-259 263 268 270 275 
279-283 285 290 292-293 301 304-305 
311 313-314 317 320 322-323 326-327 
330 332 338 342-344 348-349 354 360 
363 367 371 374 380 382-383 385 388 
394-395 398 401-403 407 409 411-412
418-420 426-427 430 435 437 442 449- 
453 459 461 465-468 470 474 477-478 
480 483 485 488 498 500 503-504 507 
515 519 522 524 529-532 538-541 544 
547 555 560 563 565 569 573-574 579- 
580 582 584 587-589 593 597 601-610 
612-613 615-618 620-622 624 634 636- 
637 639 642-644 646-647 650^657 663- 
664 674 676 679 688-689 691 693 696 
701-703 713 715 717 728 730 732 738- 






WO 01/57190 PCT/US01/04098









739 741-743 745 749 751 753 763 767 
769 772-773 785-786 793 796-797 812 
Q01 ft9A R'^n R'^'^ R4X ftSfi 859 861 
864 868-870 876-877 887 891 893-894 
898 903-904 907-911 913-918 921 923 
925-926 930-931 936 942 949-950 958 
961 966-967 969 972-973


induced neuron 
cells 


Stratagene


NTD001


9 65 82 92 106 113 142 146 156 172 176 

1 01 one 001 o^R 977 '^9R '^'^'^ '^46 '^61- 
15^1 ZVo zZl Zjo z / / jzo djd jfu jui 

362 371-372 375 388 410 414 418-419 

440 471 484 495 516 524 529-530 592 

Kin rfwa fiAO MO 748 752 761 793 

OIU OZo CrX OJv ffj f*TO ij^ /vx 

818 848 851 897 


retinoic acid
induced neuron
cells 


Stratagene


NTR001


19 87 184 305 385 440 474 626-627 643 
748 799 834 977 


neuronal cells 


Stratagene


NTU001


19 33-34 42 70 82 87 109 115 126 146 
172 185 188 194 212 255 269 274 283 

'»•••'» "itn 100 lAf\ IKO "XKI VJQ "XQA 
312 317 329 340 oOI-joZ jo / ily jyt 

399 401 410 420 426-427 474 479 507 

530 579 582-583 610 617-618 636 643 

^ro nAi\ TiC^ HiZO n9A 701 70^ 700 

658 732 740 /o3 /ov iiyr /yi /yj lyy 

OAo ona CI C CAO ft^l SLfA, R07 007 Q^9 


pituitary gland 


Clontech 


PIT004 


3 19 123-124 194 255 354 358 373-374 
377 426-427 462 492-494 635 785-786 

793 893-894 


placenta 


Clontech 


PLA003 


138 176 574 896 972 


prostate 


Clontech 


PRT001


3 9 16 57 65 75 83 108 130-134 138 141 
146 149-150 159 182 186-187 190 203 
209 234-235 276 283 322 413 415 442 
449-450 453 480 484 490 499-500 503 

cAc cAiC «oa ^'xn ^fA. '^tt^ fJODjfX^ 
505-500 jZj 3>/ J*rj joh Ood ouz-ouj 

611 619 623 643 650 697 711 729 761 
765 770 776-778 784 789 819 822 831 
839 862 866 887 904 907 921 935 962- 
963 967 973 


rectum 


Invitrogen 


REC001


19 30 33-34 66 108-109 123-124 126 129- 
131 143 149 151 156 164 190 201 240 
247 250 263 268 274 279 287 295 298- 
299 310 314 332 341 354 384 394 401 

A'^f\ /lO^ AAn dA/i 4^0 4R'^ 4RS 

•tzX) 4Zj HtA 'rrO 'rjy tOJ tOJ jmj~jax 

532 545 559 580-581 584 592 602-607 

610 612 615 619 634 637 646 655 664 
683-684 741 769 793 822 870 908-911
914-916 934 937-938 942 967 973 982 


salivary gland 


Clontech 


SAL001


16 68 74 84 121 123-124 156 172 190 203 
209 232 248 254 269 292 294 363 377 
395 398 400 402 405-406 410 430 442 
459 462 474 483 485 563-564 579 587- 
588 599 602-603 643 658 699 728 730 
737 741 748 794 822 867 876 897 903 
981 






salivary gland 


Clontech 


SALs03 


217 254 270 388 610 


skin fibroblast 


ATCC 


SFB001


517 949 


skin fibroblast 


ATCC 


SFB002 


269 688 


skin fibroblast 


ATCC 


SFB003 


3 203 897 907 


small intestine 


Clontech 


SINp01


3-4 47 57 68-69 92 99 125-126 130-131 
135 149 151-152 156 159 185 204 241 
246 291-292 318-319 338 343 348 363 
373 375 382 388-389 392-394 397 400 
437 466-467 471 484 500 517 520-521 
525 547 560 580-581 588 599 602-603 
612 624 643 711 731 733-734 757 761
769 774-775 794 824 864 904 906 910- 
911 913 948 953 959 976 984 


skeletal muscle 


Clontech 


SKM001


15 75 135 146 172 190 218 267 282 308 
410 426-427 474 505 588 620 623 658 
692 713 737 779 790 862 874 878 887 
952 962-963 


skeletal muscle 


Clontech 


SKMs04 


215 


spinal cord 


Clontech 


SPC001


14 20-21 25 28-29 31 39 46 48 59 78 83- 
84 91-92 103 112-113 135 160 168 172 
176 188 190 205 209 229 232 258 285 
301 308 312-314 321 323 329 346 374 
377 380 383 388 394 398 406 409-410 
431 449-450 453 455 466-467 470-471
484-486 488 495 497 500 503 508-509 
524 537 539 558 581 586 604-605 611
619 623 630-631 633 656 663 711 715 
729 736 740-741 761 767 769 776-778 
780 818 822 831 835-836 840 843 859 
861 871 875 887-888 897 906-907 913 
919-920 928 931 953 958 


adult spleen 


Clontech 


SPLc01


3 6 12-13 66 130-131 178 365 403 431 
461 558 610 715 797 809 876 947 967 


stomach 


Clontech 


STO001


35 114 130-131 144 155 176 189 206-207
249 260-262 336 382 398 425 431 453 
461 483 496 500 527 530 580 642 657 
663 669 748 765 768 802-803 839 891 
942 981 


thalamus 


Clontech 


THA002


30-32 48 66 109 127 130-131 135 142 
145 156-158 168 172 174 185 199 224- 
225 233 246 277 282 286 293 322 332 
334 346 374 384 400 402 420 424 435- 
437 446 466-467 485 503 506 527 542 
549 572 612 615 622 624 633 643-644 
658 676 736 790 794 824 831 835 896 
907 950 969 


thymus 


Clontech


THM001


10 16 20 28-29 32 37 41 52 57 66-67 74- 

•nc iin 110 101 10Q I'll 1/11 1^1 1 1 

75 110 118 121 Xly-Vii. iHl 131 iDy-iou 
208 211 218 247 269 289 295 297 320 
325 354 358 365 367 372 378 388-389 
395 398 411-412 420 423 435 452 500
508-509 517 524 532 537 551 558 560 
569 577-578 582 586 598 608 611 622
643 684 715 721-723 728 740 766 772- 
773 795 834 837 849 864 885 900 921 
946 948 958 962-963 965 972-973 982 


thymus 


Clontech 


"imicCC 


"1 3 9-11 16 21 27 32-34 38-39 51 55-57 
66 72 74 77-78 80 82 89-90 101 112 115
118-119 121 123-124 126 138 144 152 
159 168 174 176 178 186-188 197 200 
208 212-214 217 225 233 243-244 246 
254 256-262 279 282 285 288-289 296- 
297 313-314 322 334 343 354-355 358- 
359 363-364 367-368 372-373 382 387- 
389 395 400 402 411 414 426-427 437
440 442 449-450 454 457 462 464 469 
474 479 481 485 490-491 506 508-509 
511 517 522 526 528 532 542 551 554 
561-562 564 566-570 580-582 585 589 
597 599-600 602-608 611 613-614 619- 
621 625 628 630-631 644 646 655 669 
672 677 684 686-693 697 713 717 720
728 740 746 749 760-762 767 771 775 
794 797 804 808 811 816 818-819 837 
840 859 880 883 887-888 896-897 903 
908-911 913 916 924 936 947-948 950 
962-963 965 967 970 


thyroid gland


Clontech


THR001


3 8-9 14-15 19-22 28-29 39 41 55-56 66 
69 71-72 78-79 97 104-105 109 113 115 
119 121 123-124 130-133 135 138 143- 
144 146 148 151-152 156 159-163 165 
168 172 174 177 183-184 196 199-200 
203 209 211 215-218 228-229 232-236
244 254-255 258 273 282 290 292 294 
297 303-306 308 311 317-318 322-323 
325-326 334-335 340 342 348 354 358 
373 377 381-382 387 394 398 401-402 
405-406 409-412 416 422 425-427 429- 
431 440 449-453 462 466468 474 478- 
479 481-484 490 492-496 500-501 505- 
506 517-518 522-525 532 537 540-541 
545 551 558 560 563-564 580 583 587- 
589 593 597 599 606-607 610 617-621 
625-628 633 635 641-643 658-659 664- 
669 674 682 686 688-691 696 699 715 
724 730 740 742-743 747 750 752 759 
761 765-766 768-769 779 789 796 802- 
803 813 818-819 822 831 837 843 845 
848-849 862 864 868-869 871 874 876- 

OTH OCT Q01 9QA ao/i 907 OrtT-OOQ Q\1 
oil 00/ oyj-oyn oyO-oyi y\ii-y\iy yi^ 

919-921 923 925 928 936 940-942 944 

946-947 950 953 955 958-959 962-963 

967 969 973 981 


trachea 


Clontech 


TRC001


33-34 55-56 69 74 163 172 190 209 212 
267 270 297 305 314 352 413 426-427
466-467 500 502 504 580 586 610 613 
633 642 688 691 711 724 738-739 774
782 816 820 839 848 862 868-869 914- 
915 928 968 


Uterus 


Clontech 


UTR001


4 9 18 37 63-64 74 108 114-115 130-131 
160 166 179 184 190 209 233 249 269 
285 301 314 327 337 348 384 394 399t 

Al\t\ Am Add d1 1- 4'^ 1 A'\A A'M 440 
462 474 485 490 508-509 526 532 579 
617-619 636 642-643 672 761 769 793 
837 849 864 887 903 906 928 934 947 
967 



TABLE 2 



SEQ
ID
NO:


ACCESSION
NUMBER


SPECIES


DESCRIPTION


SMITH-
WATERMAN
SCORE


%

IDENTITY


1 


L06175 


Homo sapiens 


occurs in MHC class I region; ORF 


308 


98 


2 


Y70775 


Homo sapiens 


Follistatin-related protein z&ta. 




yo 


3 


X15187 


Homo sapiens 


precursor polypeptide (AA -21 to 
782) 


4112 


100 


4 


AF110640


Homo sapiens


orphan seven-transmembrane 
receptor 


344 


100 


5 


G03798 


Homo sapiens 


Human secreted protein, SEQ ID 
NO: 7879. 


158 


72 


6 


W85607 


Homo sapiens 


Secreted protein clone da228__6. 


1477 


100 


7 


Y30162 


Homo sapiens 


Human dorsal root receptor 4 
nDKK4. 


oo*f 




8 


Y15227 


Homo sapiens 


Uul 


391 


100 


o 


V9SIR17 




pt326 4 secreted protein. 


3338 


100 


10 


X92106 


Homo sapiens


bleomycin hydrolase 


2445 


100 


11 


Y15228 


Homo sapiens


Leu2 


445 


100 


12 


U27838 


Mus musculus


glycosyl-phosphatidyl-inositol- 
anchored protein homolog


432 


34 


13 


U27838 


Mus musculus 


glycosyl-phosphatidyl-inositol- 
anchored protein homolog 


. 320 


27 


14 


Y71062 


Homo sapiens


Human membrane transport protein, 
MTRP-7. 


2323 


99 


15 


U96781 


Homo sapiens 


Ca2+ ATPase of fast-twitch skeletal
muscle sarcoplasmic reticulum, adult
isoform 


5145 


100 


16 


M16653 


Homo sapiens 


pancreatic elastase IIB zymogen


1435 


99 


17 


Y13398 


Homo sapiens


Amino acid sequence of protein 
PR0346. 


1749 


99 


18 


Y02283 


Homo sapiens 


Secreted protein clone br342_l 1 
polypeptide sequence. 


1399 


99 


19 


Y53030 


Homo sapiens 


Human secreted protein clone d24_l 
protein sequence SEQ ID NO:66. 


1371 


100 


20 


AL031320 


Homo sapiens 


dJ20N2.5 (novel protein similar to
fucosidase, alpha-L-1, tissue (EC
3.2.1.51, alpha-L-fucosidase
fucohydrolase))


2597 


99 


21 


B01384 


Homo sapiens 


Neuron-associated protein. 


1876 


100 


22 


Y68778 


Homo sapiens


Amino acid sequence of a human
phosphorylation effector FHSP-10. 


2470 


100 


23 


Y55935 


Homo sapiens


Uiimnn 1^14^9 nrnfpin 
XlVUIlaU JVTikj.^ piUlCliJ. 


4781 


99 


24 


Y55935 


Homo sapiens 


riuman jShXiOi^ proiem. 


2807 


100 


25 


AC024792 


Caenorhabditis
elegans


CQnisms snnuan^ lo i iv.w7 jv^7 


463 


31 


26 




TOT 

787 


xiuman sccreieci proiciii uagpiciiL 


1540 


100 


27 


X97630 


Homo sapiens 


seimvMiircoiuQc prt/icuj luuooc 


3781 


98 


28 


AF150755


Mus musculus


m ICrDlllDUlC-aCQIl vll/SSllulvUJg lai^U/i 


3514 


68 


29 


AF150755


Mus musculus 


rniui tjnipmc"<ii#mi \ Jiiaoi ii uwmg ibvu/j 


3725 


70 


30 


Z38011


Mus musculus 


UJVLK-XNy 


2988 


86 


31 


AJ000522 


Homo sapiens 


axonemal dynein heavy chain 


6058 


99 


32 


AF037256 


Mus musculus 


Jdo^ protem 




91 


33 


S62140 


Homo sapiens


TLS=nuclear RNA -binding protein 


9017 


100 


34 


S62140 


Homo sapiens 


TLS=nuclear RNA-binding protein 


2890 


98 


36 


AB038237 


Homo sapiens 


G protein-coupled receptor C5L2


1 /o/ 


inn 


37 


D79994 


Homo sapiens


similar to ankyrin of Chromatium
vinosum.


0U07 




38 


X63380 


Homo sapiens


serum response factor-related protein


J.700 


00 


39 


AL022072


Schizosacchar 
omyces pombe 


lipoic acid synthetase 


in#i7 

lUU/ 


V i 


40 


J03930 


Homo sapiens 


alkaline phosphatase 


2751 


100 


41 


AF132968 


Homo sapiens 


CGI-34 protein 


iUoO 


Oft 


42 


AL117637


Homo sapiens 


hypothetical protein 


2208 


100 


43 


AL021393 


Homo sapiens 


bK747E2.1 (novel protein) 


1520 


lUU 


44 


X68011 


Homo sapiens 


ZNF81 


looO 


inn 


45 


AC002464 


Homo sapiens 


organic cation transporter; 50% 
smaOanty to JC4884 (PlD:g2l43oyz; 




inn 

lUU 


46 


W78245 


Homo sapiens


Fragment of human secreted protein 

encoded by gene 19, 




inn 


47 


Y41765 


Homo sapiens 


Human PRO 1083 protein sequence. 




inn 


48 


AF097330 


Homo sapiens 


HI chloride channel; p64Hl; CLIC4 


1305 


99 


50 


U09413 


Homo sapiens


zmc finger protem ZNr \5o 




57 


51 


AF061812 


Homo sapiens


keratin 16 




inn 


52 


W63681 


Homo sapiens


Human secreted protein 1. 


1326 


99 


53 


AB035303 


Homo sapiens


cadherin-10 




inn 


54 


A12022 


synthetic 
construct 


MRP-8 




inn 


55 


AL121897 


Homo sapiens 


bA392Ml 8.3 (KIAAU i W) 


loo/ 


inn 


56 


Y73330 


Homo sapiens 


H'IKM Clone 3V /ooj proiem 

sequence. 


Rift 

OlO 


96 


57 


AF151018 


Homo sapiens 






100 


58 


AF125042 


Homo sapiens 


Dispnospnaie j -nucicouuasc 


♦ 1586 


100 


59 


AF118670


Homo sapiens 


orpnan o proiein~coupicu rcvcpt-ux 


1971 


100 


60 


X04494 


Homo sapiens


. precuiMu poiypcpuuc 


1903 


100 


61 


AF208865 


Homo sapiens


EDRF 


528 


100 


62 


D15057 


Homo sapiens 


DAD-1 


567 


100 


63 


AF260665 


Homo sapiens


histone acetyltransferase 


1510 


100 


64 


AF260665 


Homo sapiens 


histone acetyltransferase 


1429 


96 


65 


AJ277145 


Homo sapiens 


ras-related small u 1 Fase KA15 1 o 




inn 


66 


Y94950 


Homo sapiens


Human secreted protein clone
dhl073 12 protein sequence SEQ ID
NO:106.




inn 


67 


Y82744 


Homo sapiens


DNA replication and repair
associated protein (DRASP)


1028 


100 


68 


Y44486 


Homo sapiens 


Human GPRW receptor polypeptide. 


1721 


100 


69 


AL031228 


Homo sapiens 


dJ1033B10.2 (WD40 protein BING4
(similar to S. cerevisiae YER082C,
M. sexta MNG10 and C. elegans
F28D1.1)


3196 


100 
70


AJ276316


: 

Homo sapiens 




1751 


52 


71 


Y18314 


Homo sapiens 


Tl^l <tyilKpiII~llR.C Ul SFlVill 


4146 


99 


72 


AF157028 


Homo sapiens


protein phosphatase methylesterase-1 


2017 


100 


74 


Y71082 


Homo sapiens 


Human B-aggressive lymphoma
(BAL) protein.


1765 


99 


75 


AF225420 


Homo sapiens 


AjJuZj 


734 


100 


76 


X95235 


Homo sapiens


* HI iTilinii ADO 

transdipuon lacior /vrz 


217 


100 


77 


AF108420 


Takifugu
rubripes


1-aminocyclopropane-carboxylate
synthase 


733 


56 


78 


G01349 


Homo sapiens 


Human secreted protein, SEQ ID 
NO: 5430. 


650 


99 


79 


AL117635


Homo sapiens


hypothetical protein 


922 


99 


81 


Z85986 


Homo sapiens 


dJ108K11.3 (similar to yeast
suppressor protein SRP40)




77 

— 


82 


AF183414


Homo sapiens


hemin-sensitive initiation factor 2a
kinase


^^9^1 

jZrJ 1 




83 


G01143 


Homo sapiens


Human secreted protein, SEQ ID
NO: 5224.




98 


84 


U03985 


Homo sapiens 


N-ethylmaleimide-sensitive factor


J f*r*r 


99 


85 


Y17791 


Homo sapiens


VAX2 protein 




100 


87 


AF263538 


Homo sapiens


growth differentiation factor 3


1944 


^ 


88 


Y19757 


Homo sapiens 


SKQ ID JNLi 4/5 trom w<jyyzzz<»j. 


IJ\JX 


100 


89 


AF161493 


Homo sapiens


HSPC144 


1185 


100 


90 


AF161493 


Homo sapiens


HSPC144 




mo 

I. w 


91 


B25780 


787 


Human secreted protein SEQ ID 


647 


41 


92 


U57344 


Mus musculus


Meis3 


lUU/ 


ftO 
oy 


93 


AF172854 


Homo sapiens 


cardiotrophin-like cytokine CLC 


1 1 OT 


Oft 


94 


AL390114 


Leishmania
major


extremely cysteine/valine rich
protein


223 


29 


95 


AB016886 


Arabidopsis
thaliana


contains similarity to adenylate
kinase~gene_id:MCA23.18


xo/ 


JO 


96 


AC005525 


Homo sapiens 


F22162 1 




oS 


97 


B20997 


Homo sapiens 


Human nucleic acid-binding protein, 
NuABP-1. 


DODO 




98 


AJ006692 


Homo sapiens 


ultra high sulfur keratin


507 


70 


99 


AF172264 


Homo sapiens


Traf2 and NCK interacting kinase,
splice variant 1 




yy 


100 


L11239


Homo sapiens 


homeobox protein 


717 


100 


101 


AC004890 


Homo sapiens 


similar to zinc finger proteins; 
similar to AAC01956 
(PID:g2843171) 


01 ^4 


OR 

yo 


102 


AC003682 


Homo sapiens 


R28830 2 


1287 


48 


103 


AF201839 


Rattus
norvegicus 


dynamin IIIbb isoform


497n 


95 


104 


Y79510 


Homo sapiens 


Human carbohydrate-associated 
protem i^ivD/vr-o. 


1394 


100 


105 


Y79510 


Homo sapiens 


Human carbohydrate-associated 
protem ckdAt-o. 


1209 


90 


106 


AL096748 


Homo sapiens 


hypothetical protein 


1216 


100 


108 


X97260 


Homo sapiens


Metauotmonem z 


381 


100 


109 


AL034422 


Homo sapiens 


dJ1141E15.2 (novel protein)


433 


100 


110 


AF191338 


Homo sapiens 


anaphase-promoting complex subunit
4 


fi»3 


100 


111


AT 091719 


Arabidopsis
thaliana


putative protein 


185 


26 


112 


AF250138 


Homo sapiens 


small stress protein-like protein 

HSP22 


1063 


100 


113 


AL109976 


Homo sapiens


dJ794I6. 1 . 1 (novel protein) 


4176 


99 


114 


Y36151 


787 


Human secreted protein 


668 


100 



115


AF110399


Homo sapiens 



elongation factor Ts 


1666 


100 


116 


AF210317 


Homo sapiens 


facilitative glucose transporter family
member GLUT9


2052 


99 


117 


Y73328 


- 

Homo sapiens 


XI I KM Clone uoZ6*f J proicm 
sequence. 


931 


100 


118 


X04085 


Homo sapiens 


catalase 




100 


119 


AF147717 


Homo sapiens 


ubiquitin C-terminal hydrolase
UCH37 




100 


120 


X73882 


Homo sapiens


microtubule associated protein




99 


121 


AC004882 


Homo sapiens 


similar to CAA16821 
(PID:g3255952) 


3223 


100 


122 


M93311 


Homo sapiens 


metallothionein-III


421 


100 


123 


G03827 


Homo sapiens 


Human secreted protein, SEQ ID
NO: 7908. 


DO / 




124 


G03827 


Homo sapiens 


Human secreted protein, SEQ ID
NO: 7908. 




OJ 


125 


AF232009 


Homo sapiens 


peroxisomal trans 2-enoyl CoA
reductase 


I30D 


GO 


126 


AB004906 


Ipomoea 
purpurea 


transposase 


146 


20 


127 


M60165 


Homo sapiens 


guanine nucleotide-binding 
regulatory protein 2 




GO 


128 


Y10319 


Homo sapiens 


carnitine carrier


1592 


100 


129 


U75467 


Drosophila
melanogaster 


Atu 


Vj / 


JO ' 


130 


Z21507 


Homo sapiens 


human elongation factor-1-delta




o/ 


131 


Z21507 


Homo sapiens 


human elongation factor-l-delta 




inn 

iUU 


132 


Y58633 


Homo sapiens 


Protein regulating gene expression
PRGE-26. 


6745 


100 


133 


Y58633 


Homo sapiens 


Protein regulating gene expression
PRGE-26. 


4818 


95 


134 


M13692 


Homo sapiens 


alpha-1 acid glycoprotein precursor 


1064 


99 


135 


U72970 


Sus scrofa


calcium/calmodulin-dependent
protein kinase II isoform gamma-B


27ZJ 




136 


G03213 


Homo sapiens


Human secreted protein, SEQ ID
NO: 7294.




inn 


137 


AC005102 


Homo sapiens


small inducible cytokine subfamily A 
member 24 


627 


99 


138 


AF155648 


Homo sapiens 


putative zinc finger protein




07 


139 


AF144638 


Homo sapiens 


sphingosine-1-phosphate lyase


9077 
Z.J' / / 


100 


140 


AF152318


Homo sapiens


protocadherin gamma A1


4778 


100 


141 


B08517 


Homo sapiens 


Ammo acid sec]uence of a beta- 
mounn anngen. 




100 


142 


Xjooo/ 


Homo sapiens 


calretinin


1410 


99 


143 




Homo sapiens 


t?^fft 


1605 


100 


144 


Y95zyj 


Homo sapiens 


XlUULiaH VjJj/A IXXUutUllUg i^JOJEWU&iW 
IririQCP GiiKcfTStfp clTNlir 


4092 


99 








GIC003 


1198 


100 


14o 




iiomo sopicus - 




554 


98 


147 


AJ272212 


Homo sapiens 


protein serine kinase 


2196 


100 


1 AO 

148 




riomo sapiens 




2114 


98 


149 


AB018580 


Homo sapiens


hluPGFS 


1699 


100 


150 


X91865 


Homo sapiens


sncl 




100 






Mus musculus


pseudouridine synthase 3


2135 


84 


152 


U29170 


Drosophila
melanogaster 


ANON-23D 


883 


43 


153 


G04075 


Homo sapiens 


Human secreted protein, SEQ ID
NO: 8156. 


567 


99 


154 


AY009128 


Homo sapiens 


ISCU2 


138 


100 



155


AF141315 


Homo sapiens 


alpha- 1,4-N- 

gCCty igiucoiKi III iny Auaiijicii <pp 


1842 


100 


156 


AF110645


Homo sapiens 


candidate tumor suppressor
ING1 homolog


1294 


99 


157 


AF159297 


Zea mays


extensin-like protein


238 


25 


158 


AL133325 


Homo sapiens


uJyo4i'4.:> \^jtiomeoDox pioiem 
NKX2B) 


1437 


100 


159 


AF073298 


Homo sapiens 


small jciUKjs.*ncn laccor z 


294 


100 


160 


AC004858 


Homo sapiens 


U 1 sniaii noonucieoproiem i oin jvr 
nomoiogy mawn lo i^iL/.g^vjuuo / 


4032 


100 


161 


AB012109 


Homo sapiens 


APC10


990 


100 


162 


AL162751 


Arabidopsis 
thaliana


putative protein 


194 


32 


163 


AJ005698 


Homo sapiens 


poly(A)-specific ribonuclease 




ion 


164 


AF117646


Homo sapiens 


long CBL-3 protein 




00 


165 


AC004002


Homo sapiens 


similar to ciliary dynein beta heavy 
cnam^ 7o% oumianiy to rxjuyo 
(rlUzglloyo^^ 




1 w 


166 


M10942 


Homo sapiens


human metallothionein-Ie


JO x 


100 


167 


AF126484 


Homo sapiens


CAKl/4 


4961 


100 


168 


AF161518 


Homo sapiens 


HSPC169


l\}\rr 


100 


169 


M64983 


Homo sapiens


fibrinogen beta chain


2482 


100 


170 


M64983 


Homo sapiens 


fibrinogen beta chain 




inn 


171 


M58514 


Gallus gallus 


fibrinogen beta chain 


1059 


78 


172 


AF078845 


Homo sapiens 


16.7Kd protein 


/oO 


1 nn 


173 


AC004774 


Homo sapiens 


Dlx-6 




inn 


174 


Z98974 


Schizosacchar 
omyces pombe 


putative vacuolar protein sorting- 
associated protein 


loD 


^ i 


175 


X56203 


Plasmodium 
falciparum 


liver stage antigen 






176 


W74726 


Homo sapiens 


Human secreted protem igy4y__3. 


1R70 


inn 


177 


AJ222967 


Homo sapiens 


cystinosin 


1920 


100 


178 


AC024796 


Caenorhabditis
elegans


contains similarity to 1 ICU/O i o / 




97 


179 


Y66632 


Homo sapiens 


Membrane^DOuna protem JtKUz / o. 




inn 


180 


AF151803 


Homo sapiens 


CGI-45 protein 


215 


28 


181 


G02694 


Homo sapiens 


Human secreted protein, SEQ ID 
NO: 6775. 


9!l'^ 


inn 


182 


Y17292 


Homo sapiens 


Human cell death preventing kinase 
(DPK-1) protein sequence. 


9#;7A 
Zu/D 


inn 


183 


AF234765 


Rattus
norvegicus


serine-arginine-rich splicing
regulatory protein SRRP86


148 


27 


184 


AF151855 


Homo sapiens 


CGI-97 protein 


1214 


96 


185 


AF289664 


Mus musculus 


CYLN2 


4673 


90 


186 


AL022238 


Homo sapiens 


dJ1042K10.2 (supported by
GENSCAN, FGENESH and
GENEWISE)




inn 


187 


AL022238 


Homo sapiens 


dJ1042K10.2 (supported by
GENSCAN, FGENESH and
GENEWISE)


2332 


100 


188 


X83543 


Homo sapiens 


APYT 


8513 


99 


189 


AF059569 


Homo sapiens 


actin binding protein MAYVEN 


3106 


99 


190 


M18135 


Rattus 
norvegicus


smooth-muscle alpha tropomyosin


unit 


yj 


191 


AF242194 


Drosophila
melanogaster 


brakeless-B 


147 


52 


192 


D30689 


Bacillus 
subtilis 


subunit of nitrite reductase 


113 


29 


193 


Y44984 


Homo sapiens 


Human epidermal protein-1.


538 


97 





194 


B25679 


Homo sapiens 


Human secreted protein sequence 
encoGeu oy gene i -> oca^ ■u-' i^v-r.uo. 


760 


100 


195 


AB020315 


787 


nomoiogue oi mouse okk*! gcucAUi* 


1466 


100 


196 


U35730 


Mus musculus 


jerky


2021 


75 


197 


AL136450 


Homo sapiens 


dJ510O21.1 (novel protein) 


632 


100


198 


X56203 


Plasmodium 
falciparum


liver stage antigen 


512 


24 


199 


Y70775 


Homo sapiens 


Foliistatm-reJaica protein zisia. 


2027 


63 


200 


X87237 


Homo sapiens 


a-glucosidase I 


4447 


99 


201 


AF101078 


Caenorhabditis
elegans 


CLU-1 


1393 


46 


202 


X04571 


Homo sapiens 


precursor polypeptide (AA -22 to 
1185) 


6611 


100 


203 


X00474 


Homo sapiens


pS2 precursor 


■too 


100 


204 


AB029333 


Halocynthia
roretzi 


HrPET-1 




S4 


205 


AF146019 


Homo sapiens 


hepatocellular carcinoma antigen
gene 520


908 


100 


206 


AF071002 


Homo sapiens 


MinK-related peptide 1; MiRP1


MO 


100 


207 


AB038162 


Homo sapiens


trefoil factor 2


nAA 

/*!*•' 


100 


208 


U30521 


Homo sapiens


P311 HUM 






209 


AB000911 


Sus scrofa 


ribosomal protein


782 


100 


210 


AB021227


Homo sapiens 


membrane-type-5 matrix 
metalloproteinase 




inn 


211 


AF180920 


Homo sapiens 


cyclin L ania-6a




i.vFvl 


212 


AF105365 


Homo sapiens 


K-Cl cotransporter KCC4




inn 


213 


U29244 


Caenorhabditis
elegans 


similar to human (^ i Kb^ transionmng 
protem (PIR:o22 id/) 


ouz 




214 


AL033538 


Homo sapiens


aJ477n23.1 (novel protem^ 




100 


215 


X52011 


Homo sapiens


muscle determination factor


1262 


100 


216 


AF083248 


Homo sapiens 


ribosomal protein L26 homolog


/ 


1 no 


217 


AF006751 


Homo sapiens 


T^rt /tort 

ES/130 






218 


AB007859 


Homo sapiens 


KIAA0399 protein




99 


219 


AK026291 


Homo sapiens 


unnamed protein product




100 


221 


Y84045


Homo sapiens 


Splice variant of cancer associated 
polypeptide Cxi i -i^a 1 1 -z- 


5851 


97 


222 


Z67996 


Homo sapiens 


tenascin-R (restrictin)


/ AOVJ 


100 


223 


AF134802 


Homo sapiens 


cofilin isoform 1


846 


100 


224 


Y17711 


Homo sapiens 


atopy related autoantigen wVLu 


1^11 




225 


AF190051 


Gallus gallus 


hepatocyte nuclear factor la 
uimenzanon coiacior isoioim 


AA1 


SI 


226 


AK026256 


Homo sapiens 


unnamed protein product 


866 


98 


227 


Z69368 


Schizosacchar
omyces pombe 


-nuiZ'^liKe coiiea-cou pruvem 


230 


25 


228 


AF275948 


Homo sapiens


A or* A 1 


11763 


99 


229 


AFlol384 


Homo sapiens 




2006 


98 


230 


Y16270 


Homo sapiens 


paralemmin


1951 


100 


231 


AJ245599 


Homo sapiens


putative secreted ligand


2379 


99 


232 


W88499 


Homo sapiens


Human stomach carcinoma clone 
rir 1 U4 i z-encooea proiem. 


1545 


99 


233 


AF096286 


Mus musculus


pecanex 1 


3623 


93 


234 


V64619_cd1


Homo sapiens


30-NOV-1990 Human HE1 cDNA.


796 


100 


235 


V64619_cd1


Homo sapiens


30-NOV-1990 Human HE1 cDNA.


470 


98 


236 


AP227258 


Bos taurus


RPGR-interacting protein-1 


1262 


38 


237 


AJ132445 


Homo sapiens 


claudin-14 


1181 


100 


238 


AL034562 


Homo sapiens 


dJ684O24.1 (prodynorphin (Beta-


1330 


100 
Neoendorphin-Dynorphin precursor,
Proenkephalin B precursor)) 






239 


AF262027 


Homo sapiens 


eIF-5A2


808 


100 


240 


AL079344 


Arabidopsis 
thaliana 


putative protein 




DJ 


241 


AC002394 


Homo sapiens


Gene product with similarity to
dynein beta subunit


1542 


f 1 


242 


AJ271361 


Takifugu 
rubripes


FRANK2 protein 


303 


3U 


243 


AL021918 


Homo sapiens 


b34I8.1 (Kruppel related Zinc Finger
protein 184) 


1476 


48 


244 


AF190167 


Homo sapiens 


membrane associated protein SLP-2 


1736 


99 


245 


Y10601 


Homo sapiens


ankyrin-like protein 


5877 


100 


246 


AL121771 


Homo sapiens 


dJ548G19.1.1 (novel protein
(ortholog of mouse zinc finger
protein ZFP64) (translation of cDNA 
NT2RP3001398 (Em.AK001596)) 
(isoform 1)) 


3628 


100 


247 


L25314 


Drosophila 
melanogaster 


actin-related protein 


984 


47 


248 


X63745 


Homo sapiens 


KDEL receptor 


1095 


100 


249 


AFl 12208 


Homo sapiens 


13kDa differentiation-associated 
protein 


816 


100 


250 


AP001707 


Homo sapiens 


human gene for claudin-8. Accession 
NO.AJ250711 


1172 


100 


251 


AL136125 


Homo sapiens 


dJ304B14.1 (novel protein) 


778 


100 


252 


AL031186 


Homo sapiens 


bK984Gl.l (supported by FGENES) 


532 


100 


253 


Y17531 


Homo sapiens 


Human secreted protein clone BL205 
14 protein. 


639 


100 


254 


AL049843 


Homo sapiens 


dJ392M17.3 (KIAA0349 protein) 


6741 


99 


255 


AJ242972 


Homo sapiens 


TOLLIP protein 


1424 


99 


256 


Y94873 


Homo sapiens 


Human protein clone HP02632. 


1876 


100


257 


AP279865 


Homo sapiens 


kinesin-like protein GAKIN


2903 


100


258 


AL024498 


Homo sapiens 


dJ417M14.1 (novel protein) 


589 


100


259 


R66278 


Homo sapiens 


Therapeutic polypeptide from
glioblastoma cell line. 


830 


100 


260 


AF101784 


Homo sapiens 


b-TRCP variant E3RS-IkappaB 


3226 


99 


261 


AF101784 


Homo sapiens 


b-TRCP variant E3RS-IkappaB 


2821 


100


262 


AF101784 


Homo sapiens 


b-TRCP variant E3RS-IkappaB


3149 


99 


263 


AF197060 


Homo sapiens


src homology 3 domain-containing 
protein HIP-55 


2257 


100


264 


Y86262 


Homo sapiens 


Human secreted protein HAQAR23, 
SEQ ID NO: 177. 


loo 


100


265 


Y56966


Homo sapiens 


Human SBPSAPL polypeptide.




100


266 


Y56966 


Homo sapiens 


Human SBPSAPL polypeptide. 


lUlo 


00 


267 


AJ300465 


Homo sapiens 


putative white family ATP-binding 
cassette transporter 




7J 


268 


AC004030 


Homo sapiens 


F21856 2 






269 


X55954 


Homo sapiens 


HL23 ribosomal protein 


714 


100 


270 


AB033921 


Mus musculus


Ndr1 related protein Ndr2




QA 


271 


AF081886 


Homo sapiens 


ERO1-like protein


iWj 


OO 

yy 


272 


AF166492 


Homo sapiens 


small GTPase RAB6B 


1060 


100 


273 


AL022238 


Homo sapiens


dJ1042K10.4 (novel protein) 


2201 


100


274 


W88667 


Homo sapiens 


Secreted protein encoded by gene
134 clone HAIBP89. 


1530 


99 


275 


X00129 


Homo sapiens 


precursor RBP 


1044 


97 


276 


Z47500_cd1


Homo sapiens 


11-MAY-1998 Human RHOH gene
sequence.


1161 


100 


277 


AB049188 


Equus caballus


ubiquitin C-terminal hydrolase


1118 


96 
278


AF270647 


Homo sapiens 


(j1 1 1 


1564 


100 


279 


AF143956 


Mus musculus


coronin-2 


2414 


94 


280 


R85151 


Homo sapiens 


Endothelial cell polypeptide. 


911 


92 


281 


R85151 


Homo sapiens 


Endothelial cell polypeptide. 


1031 


100 


282 


D83948 


Rattus 
norvegicus


Sl'l protein 




on 
yu 


283 


Y14768 


Homo sapiens 


I Kappa B-like protein


^\mO f 


100


286 


AL031316 


Homo sapiens 


dJ28O103 (HSD11B1
(hydroxysteroid (11-beta)
dehydrogenase 1))


294 


100 


287


D64109 


Homo sapiens


tob family 




yy 


288 


AB026043 


Homo sapiens 


MS4A7 


1230 


100 


289 


M61866 


Homo sapiens 


Krueppel-related DNA-binding
protein 


209 


yu 


290 


AJ001810 


Homo sapiens 


mRNA cleavage factor I 25 kDa
subunit 


1217 


100


291 


Y99454 


Homo sapiens 


Human PRO1605 (UNQ786) amino
acid sequence SEQ ID NO:395. 


oy4 


100


292 


Y44824 


Homo sapiens 


Human molecule associated with cell 
proliferation, MACP-4. 


2370 


100


293 


AJ276101 


Homo sapiens 


GPRC5B protein


zOyy 


100


294 


AF161406 


Homo sapiens 


HSPC288 


719 


100 


295 


Y58628


Homo sapiens


Protein regulating gene expression
PRGE-21.


1276 


100 


296 


U91561 


Rattus 
norvegicus 


pyridoxine 5'-phosphate oxidase


1239 


o7 


297 


L02956 


Xenopus 
laevis 


ribonucleoprotein


1624 


o3 


298 


AF226730 


Homo sapiens 


Cyt19


1729 


nn 

yy 


299 


AF226730 


Homo sapiens 


Cyt19


906 


98 


300 


Y54324 


Homo sapiens 


Amino acid sequence of a human 
gastric cancer antigen protein. 


718 


on 

©y 


301 


AF125533 


Homo sapiens 


NADH-cytochrome b5 reductase
isoform 


1606 


100 


302 


Y32206


Homo sapiens 


Human receptor molecule (RBC) 
encoded by Incyte clone 2825826. 


lovo 


OB 

y© 


303 


AF247565 


Homo sapiens 


hepatocellular carcinoma associated 
ring finger protein 


525 


100


304 


AF208844 


Homo sapiens 


BM-002 


AO 8 


100


305 


AC004983 


Homo sapiens


similar to PID:g3877944


1988 


100 


306 


AL132978 


Arabidopsis 
thaliana 


putative protein 






307 


Y10530


Homo sapiens 


olfactory receptor


1645 


100 


308 


AF180681


Homo sapiens 


guanine nucleotide exchange factor 




100


309 


AF111856 


Homo sapiens 


sodium dependent phosphate 
transporter isoform NaPi-3b




QQ 


310 


Y13583 


Homo sapiens 


G-protein coupled receptor 


9171 


100 


311 


Z73420 


Homo sapiens


cE14oD10.2 (mercaptopyruvate
sulfurtransferase (EC 2.8.1.2))




100 


312 


X79535 


Homo sapiens 


beta tubulin 




100 


313 


AF070658 


Homo sapiens 


HSPC002 


861 


100 


314 


AF078866 


Homo sapiens 


SURF-4 




100


317 


Z37986 


Homo sapiens 


phenylalkylamine binding protein 




100


320




Ail O O 

fascicularis


hypothetical protein


258 


82 


321 


Y25755 


Homo sapiens 


Human secreted protein encoded
from gene 45. 


1440 


100 


322 


AB016531 


Homo sapiens


PEX16 


1741 


100 


323 


AU91141 


Arabidopsis 


putative protein - 


274 


49 
thaliana








325 


A1*1403U1 


Homo sapiens


DNA polymerase iota


3691 


99 


326 


X96698 


Homo sapiens 


D1075-like 


1450 


96 


327 


AF152325


Homo sapiens 


pr vJiUCaUIlCi ill gqmiiia txj 


4769 


100 


328 


AF151803 


Homo sapiens 


CGI-45 protein 


1970 


100 


329 


X74070 


Homo sapiens 


uoiiscripuiin louiui o xcj 


639 


81 


330 


AF171102 


Homo sapiens 


retinal degeneration B beta 


1302 


95 


331 


W54040 


Homo sapiens


Human interferon-inducible protein,
HIFI. 


484 


98 


332 


AF024617 


Homo sapiens 


transcription-associated zinc ribbon 
protein 


691 


100 


333 


U19181 


Rattus 
norvegicus 


Rabin3 


9190 


90 


334 


G03877 


Homo sapiens


Human secreted protein, SEQ ID
NO: 7956.




100 


335 


AL008582 


Homo sapiens


bK223H92 (ortholog of A. thaliana 
F23F1.8) 


626 


100 


336 


AF110774 


Homo sapiens 


adrenal glana protem almiui 


OH / 


100 


337 


AB011414 


Homo sapiens 


Kruppel-type zinc finger protein


1674 


58 


338 


AF207600


Homo sapiens 


ethanolamine kinase


129 


100 


340 


AC020579 


Arabidopsis 
thaliana


putative 

phosphoribosylformylglycinamidine

^cct\€\ onA<A 

synthase; 25509-29950 




SO 


341 


Y28576 


Homo sapiens 


Secreted peptide clone pe503_L 


944 


100 


342 


U32274 


Saccharomyce 
s cerevisiae 


Ydr386wp;CAI: 0.12 



191 


37 


343 


A01771 


synthetic 
construct 


vascular anticoagulating protein 


1001 


00 


344 


AF22P052 


Homo sapiens 


uncharacterized hematopoietic 
stem/progenitor cells protein 
MDS032 




ion 


345 


Y70400 


Homo sapiens 


Human cell-signalling protein-2.


7S4 

f «JH 


100 


346 


Y50926 


Homo sapiens 


Human fetal brain cDNA clone 
vcl6 1 derived protein. 




100 


347 


AF183428


Homo sapiens


28.4 kDa protein 




100 


348 


AC006069 


Arabidopsis 
thaliana 


putative cleavage and 
polyadenylation specificity factor


1383 


55 


349 


AL032631 


Caenorhabditis
elegans 


Y106G6H.8 


194 


39 


350 


U70669 


Homo sapiens 


Fas-ligand associated factor o


167 


23 


351 


Y93468 


Homo sapiens


Amino acid sequence of a potassium
channel interactor protein. 


1182 


92 


352 


AF005856 


Drosophila
yakuba 


anonzA^ 


111 


45 


353 


AJ271684 


Homo sapiens 




1013 


100 


354 


AF099100 


Homo sapiens 


WD-repeat protein 6 


2882 


99 


355 


U51730 


Murine 

leukemia virus 


reverse transcriptase


316 


42 


356 


D50617 


Saccharomyce 
s cerevisiae 


YFL042C


279 


27 


357 


D50617 


Saccharomyce 
s cerevisiae 


YFL042C 


279 


27 


358 


AF161432


Homo sapiens 


HSPC314 


1059 


93 




359


AB029488


Homo sapiens 


C11orf21


758 


99 


360 


AJ251024 


Homo sapiens


putative odorant binding protein ag 


1239 


100 


361 


U43281 


Saccharomyce 
s cerevisiae


Lpg22p


2074 


74 


362 


U43281 


Saccharomyce 
s cerevisiae


Lpg22p 


2153 


74 
363


AC007153 


Arabidopsis 




156 


24 


364 


AF197927 


Homo sapiens 


AF5q31 protein 


3992 


99 


365 


D28500 


Homo sapiens


mitochondrial isoleucine tRNA
synthetase


4286 


98 


366 


X97868 


Homo sapiens


arylsulphatase


3141 


98 


367 


AL162048 


Homo sapiens 


hypothetical protein 


1532 


100 


368 


L360^ 


Mus musculus 


steroidogenic acute regulatory
protein 


189 


25 


369 


AF113249


Homo sapiens 


multiple domain putative nuclear
protein 


1022 


59 


370 


M15888 


Bos taurus


endozepine-related protein precursor


2425 


84 


371 


X66363 


Homo sapiens 


serine/threonine protein kinase


2562 


100 


372 


W74802 


Homo sapiens 


Human secreted protein encoded by
gene 73 clone nov^JoJLz?.


1532 


89 


373 


AF100772 


Homo sapiens 


tenascin-M1


11535 


99 


374 


AF090934


Homo sapiens


PRO0518 


382 


100 


375 


AB021643 


Homo sapiens


gonadotropin inducible transcription
repressor-3 


5761 


99


376 


AB049758 


Homo sapiens


MAWD binding protein 


1331 


100 


377 


AF070666 


Homo sapiens


Kruppel-associated box protein


*rOO 


97 


378 


S59342 


Mus sp. 


nuclear pore complex glycoprotein 
p62 




60 


379 


AF149205 


Mus musculus 


Su(var)3-9 homolog Suv39h2 


1690 


88 


380 


AF227906 


Homo sapiens 


UDP-glucose:glycoprotein
glucosyltransferase 2 precursor




99 


381 


AF118566


Mus musculus 


hematopoietic zinc finger protein 




7^ 


382 


AK000619 


Homo sapiens 


unnamed protein product 


810 


100 


383 


AF227906 


Homo sapiens 


UDP-glucose:glycoprotein 
glucosyltransferase 2 precursor 


/oDl 


QO 


384


AF117946 


Homo sapiens


Link guanine nucleotide exchange
factor II


2363 


100 


385 


AF125390 


Drosophila 
melanogaster 


L82G 


139 


41 


386 


Y94907 


Homo sapiens 


Human secreted protein clone
cal06 19x protein sequence SEQ ID
NO:20.


1092 


50 


387 


U18795 


Saccharomyce 
s cerevisiae


Yel064cp 




28 


388 


AF177388 


Homo sapiens


cancer-amplified transcriptional 
coactivator ASC-2 






389 


AJ002744 


Homo sapiens 


UDP-GalNAc:polypeptide N-
acetylgalactosaminyltransferase 7


3469 


96 


390 


AF097366 


Homo sapiens 


cone sodium-calcium potassium 
exchanger 


3166 


100 


391 


AF217525 


Homo sapiens 


Down syndrome cell adhesion 
molecule


5337 


60 


392 


U81035


Rattus 
norvegicus 


ankyrin binding cell adhesion
molecule neurofascin


3967 


91 


393 


X65224 


Gallus gallus


neurofascin


4097 


78 


394 


X13916 


Homo sapiens 


LDL-receptor related precursor (AA 


4292 


99 


395 


AF151083 


Homo sapiens 


HSPC249 


444 


98 




396


AB017026


Mus musculus 


oxysterol-binding protein 


2173 


98 


397 


AL035587 


Homo sapiens 


dJ475N16.4 (KIAA0240) 


2393 


100 


398 


W74813 


Homo sapiens 


Human secreted protein encoded by 
gene 85 clone HSDFV29. 


722 


92 


399 


Y71110 


Homo sapiens 


Human Hydrolase protein-8 
(HYDRI^8). 


1637 


99 
400


AF039718 


Caenorhabditis
elegans 


contains similarity to lupus LA
protein homologs 


325 


43 


401 


AE000877 


Methanotherm
obacter 
thermoautotro 
phicus 


conserved protein


231 


36 


402 


Y27795 


Homo sapiens 


Human secreted protein encoded by 
gene No. 79.


1539


5/y 


403 


Z50853 


Homo sapiens 


CLPP 


OiD 


ion 


405 


X03475 


Rattus 
norvegicus 


ribosomal protein L35a (aa 1-110)


D/0 




406 


AF144237 


Homo sapiens


LOMP protein 




AA 


407 


U20239 


Mus musculus


fibrosin 


288 


76 


409 


AL033378 


Homo sapiens 


dJ323M4.1 (KIAA0790 protein) 


oOZo 




410 


X54326 


Homo sapiens 


glutaminyl-tRNA synthetase


Ibll 


yy 


411 


X61585 


Bos taurus 


polynucleotide adenylyltransferase 


3715 


97 


412 


AF217190 


Homo sapiens 


MLEL1 protein


5271 


99 


414 


G02815 


Homo sapiens


Human secreted protein, SEQ ID 
NO: 6896. 


314 


95 


415 


AJ245922 


Homo sapiens 


alpha-tubulin 8 


2370 


100 


416 


AF203032 


Homo sapiens 


neurofilament protein 


220 


21 


417 


Z97653 


Homo sapiens 


C380A1 J2.1 (novel protein (isoform

D) 


1567 


100 


418 


AJ404326 


Homo sapiens 


SR+89 


1871 


99 


419 


AJ404326 


Homo sapiens


SR+89 


902 


64 


420 


AF134726 


Homo sapiens 


G9A 


5334 


99 


421 


L28125 


Podospora 
anserina 


beta transducin-like protein


288 


39 


422 


W21733 


Homo sapiens


NIP-1 encoded by clone 59. 


110 


72


423 


S67970 


Homo sapiens


ZNF75=KRAB zinc finger 


951 


76 


424 


L28035 


Mus musculus 


protein kinase C gamma 


3768 


98 


426 


Y73373


Homo sapiens 


HTRM clone 921803 protein 
sequence. 


555 


56 


427 


Y73373 


Homo sapiens 


HTRM clone 921803 protein 
sequence. 


266 


AQ 


428 


X61118 


Homo sapiens 


TTG-2a/RBTN-2a 


876 


100


429 


Z96932 


Homo sapiens


nuclear autoantigen of 14 kDa


496 


83 


430 


AI277291 


Homo sapiens 


HELG protein 


678 


72 


431 


X82157 


Homo sapiens 


hevin 


3525 


99 


432 


AC007192 


Homo sapiens 


P85B_HUMAN; PTDINS-3-
KINASE P85-BETA


3825 


99 


433 


AL021918 


Homo sapiens 


b34I8.1 (Kruppel related Zinc Finger 
protein 184) 


1713 


50 


434 


AF084464 


Rattus 
norvegicus 


GTP-binding protein REM2 


141 


29 


435 


AL049795 


Homo sapiens


dJ622L5^ (novel protein) 


1756 


98 


436 


M14513 


Rattus 
norvegicus 


(Na+ and K+) ATPase, alpha(III) 
catalytic subunit 


4269 


99 


437 


U33460 


Homo sapiens 


DNA-directed RNA polymerase I, 
largest subunit 


8777 


9o 


438 


D87076 


Homo sapiens 


similar to human bromodomain 
protein BR140(JC2069) 


3Uo/ 


100


439 


L43912 


Macaca 
mulatta 


mannose-binding protein A 


589 


93 


440 


D31763 


Homo sapiens 


ha0946 protein is Kruppel-related. 


927 


49 


441 


U70976 


Homo sapiens


arrestin 


2068 


99 


442 


B08069 


Homo sapiens


A human beta-alanine-pyruvate
aminotransferase (HAPA). 


2343 


99 


443 


AF10066? 


Caenorhabditis


contains similarity to ubiquitin 


166 


24 
elegans


carboxyl-terminal hydrolase (Pfam:
UCH-1.hmm, score: 28.46) (Pfam:
UCH-2.hmm, score: 47 J3)






444 


D78017 


Rattus 
norvegicus 




2667 


no 


445 


AL049569 


Homo sapiens 


dJ37C103 (novel ATPase) 


2418 


100


448 


AJ242540 


Volvox carteri
f. nagariensis 


hydroxyproline-rich glycoprotein 
DZ-HRGP 


165 


J** 


449 


AJ133352 


Homo sapiens 


ZNF237 protein


2U(I0 


100


450 


AJ133352 


Homo sapiens


ZNF237 protein 


1025 


96 


451 


AF170708 


Homo sapiens 


T-box protein TBX3 


3700 


99 


452 


AK002080 


Homo sapiens 


unnamed protein product 


1546 


99


453 


U2977 


Homo sapiens 


Rieske Fe-S protein 


1239 


93 


454 


X51760 


Homo sapiens 


zinc finger protein (583 AA)


1533 


57 


455 


Y01141 


Homo sapiens 


Secreted protein encoded by gene 7 
clone HTLFA90. 


1453 


99 


456 


AB006631 


Homo sapiens


The human homolog of mouse Cux-2


6559 


100 


457 


AF067165 


Homo sapiens 


zinc finger protein 3 


977 


64 


458 


AF038169 


Homo sapiens 


unknown 


154 


38 


459 


W75214 


Homo sapiens 


Human secreted protein encoded by 
gene 19 clone HRSMC69. 


1180 


95 


460 


U97002


Caenorhabditis
elegans 


similar to acyl-CoA dehydrogenases
and epoxide hydrolases; Pfam
domain PF00441 (Acyl-CoA_dh),
Score=57.4, E-value=1.7e-16, N=2;
contains similarity to Pfam domain
PF00702 (Hydrolase), Score=57.4,
E-value=1e-13, N=1


583 


37 


461 


AK023114 


Homo sapiens 


unnamed protein product 


1041 


99 


462 


M93134 


Friend murine 
leukemia virus 


pol protein 


289 


44 


463 


AF055473 


Homo sapiens 


GAGE-8 


232 


47 


466 


Y51415 


Homo sapiens 


Human wild type pKe83 protein. 


2625 


100 


467 


Y51417 


787 


Human pKe83 splice variant protein 


2433 


100 


468 


Y57936 


Homo sapiens


Human transmembrane protein 
HTMPN-60. 


1629 


96 


469 


D38552 


Homo sapiens 


The hal539 protein is related to 
cyclophilin. 


2995 


100 


470 


Y70013 


Homo sapiens 


Human Protease and associated 
protein-7 (PPRG-7). 


3530 


100 


471 


AJ224747 


Homo sapiens 


C-terminal variant of hINADL 
including 2 amino acid exchanges 
and an insertion of 28 amino acids in 
frame.


7969 


100 


472 


W99665 


Homo sapiens 


Human secreted protein clone
dul57 12 protein.


1546 


100 


473 


W99665 


Homo sapiens 


Human secreted protein clone 
dul57 12 protein. 


998 


98


474 


X63526 


Homo sapiens 


homologue to elongation factor 1-
gamma from A.salina


2273 


99


475 


X15940 


Homo sapiens 


ribosomal protein L31 (AA 1-125) 


644 


100 


476 


M60832 


Homo sapiens 


alpha-2 type VIII collagen


3581 


99 


477 


AF039697 


Homo sapiens 


antigen NY-CO-31 


1213 


97 


478 


AF156929 


Sus scrofa


inflammatory response protein 6 






479 


AF264717 


Homo sapiens 


FYVE domain-containing dual
specificity protein phosphatase 
FYVE-DSP2 


5610 


99 


480 


AF044578 


Homo sapiens 


putative DNA polymerase; POMP 


2478 


94 


481 


X89750 


Homo sapiens 


TGIF protein 


1413 


100 





482


M93107 


Homo sapiens 


(R)-3-hydroxybutyrate
dehydrogenase 


1663 


96 


483 


U58334 


Homo sapiens 


Bbp/53BP2 


1556 


41 


484 


AF151538 


Homo sapiens 


deoxycytidyl transferase; Rev1p


4281 


99 


485 


Z98884 


Homo sapiens 


dJ467LLl (KIAA0833)


699 


73 


486 


AJ243874 


Homo sapiens 


oligophrenin-4


3682 


100 


487 


Z11737 


Homo sapiens


flavin-containing monooxygenase 4 


2969 


100 


488 


X56123 


Mus musculus


talin 


4353 


77 


489 


AJ278112 


Homo sapiens 


putative cell cycle control protein 


335 


23 


490 


W74843 


Homo sapiens 


Human secreted protein encoded by 
gene 115 clone HOVBA03.


1013 


98 


491 


Y41337 


Homo sapiens 


Human secreted protein encoded by 
gene 30 clone HRDDV47. 


509 


36 


492 


X90530 


Homo sapiens 


ragB 


1926 


99 


493 


X90530 


Homo sapiens 


ragB 


1405 


99 


494 


X90530 


Homo sapiens 


ragB 


1893 


96 


495 


AL022394


Homo sapiens


dJ511B24.3 (KIAA0395 (probable 
homeobox protein))


4990 


99 


496 


Y11395 


Homo sapiens


lanthionine synthetase C-like protein
1 


2168 


100 


497 


AJ010119 


Homo sapiens 


Ribosomal protein kinase B (RSK-B) 


4001 


100 


498 


G01563 


Homo sapiens 


Human secreted protein, SEQ ID 
NO: 5644. 


330 


100 


499 


X54131 


Homo sapiens 


protein-tyrosine phosphatase 


10465 


99 


500 


G01082 


Homo sapiens 


Human secreted protein, SEQ ID 
NO: 5163. 


549 


100 


501 


AC004142 


Homo sapiens 


similar to murine leucine-rich repeat 
protein; possible role in neural 
development by protein-protein 
interactions; 93% similarity to 
D49802 (PID:gl369906) 


3676 


100 


502 


AL117544


Homo sapiens 


hypothetical protein 


1226 


100 


503 


AF203032 


Homo sapiens 


neurofilament protein 


5115 


99 


504 


AL034417 


Homo sapiens 


bK215D11.2 (similar to rat gene 33)


2476 


100 


505 


X69090 


Homo sapiens 


190kD protein 


7546 


99 


506 


U58755 


Caenorhabditis 
elegans 


coded for by C. elegans cDNA 
yk34bL5; coded for by C. elegans 
cDNA ykl3hl0.5; coded for by C. 
elegans cDNA yk46e8.5; coded for 
by C. elegans cDNA yk46d5.5; 
coded for by C. elegans cDNA 
yk43c2.5; coded for by C. elegans 
cDNA yk46e8.3; coded for by C. 
elegans cDNA yk43c2 J; coded for 
by C. elegans cDNA yk46d53; 
coded for by C. elegans cDNA 
ykl3fl0.3; coded for by C. elegans
cDNA yk34bl.3


782 


55 


507 


AJ293309 


Homo sapiens 


NHP2 protein 


801 


100 


508 


U39045 


Rattus 
norvegicus 


cytoplasmic dynein intermediate 
chain 2B 


3241 


97 


509 


AF063231 


Mus musculus 


cytoplasmic dynein intermediate 
chain 2 


3159 


97 


510 


AF202893 


Mus musculus 


Kif21b


43Jo 


yO 


511 


Y13115 


Homo sapiens 


serine/threonine protein kinase 


5071 


99 


512 


AB030207 


Homo sapiens 


G gamma subunit 


364 


100 


513 


AF039571 


Homo sapiens 


peripheral benzodiazepine receptor
interacting protein; PBR-IP/PRAX1


495 


33 


514 


AB037883 


Homo sapiens


Gb3/CD77 synthase


1916 


99 





515 


D90868 


Escherichia
coli 


similar to 


1489 


100 


516 


X98834 


Homo sapiens 


zinc finger protein Hsal2 


5290 


100 


517 


AF055668 


Mus musculus 


apoptosis-linked gene 4, deltaC form 


2904 


78 


518 


AF019926


Mus musculus 


protein kinase


1694 


90 


519 


M34513 


Homo sapiens 


omega protein 


317 


91 


520 


Y08612 


Homo sapiens 


88kDa nuclear pore complex protein 


2313 


99 


521 


Y08612 


Homo sapiens 


88kDa nuclear pore complex protein 


1561 


99 


522 


AL096766 


Homo sapiens 


dA59H18.1 (KIAA0767 protein) 


2497 


100 


523 


AF186249


Homo sapiens 


six transmembrane epithelial antigen 
of prostate 


1790 


100 


524 


AB029012 


Homo sapiens 


KIAA1089 protein


4933 


100 


525 


AB026893 


Homo sapiens 


vascular cadherin-2 


5962 


100 


526 


X74331 


Homo sapiens 


DNA primase (p58 subunit) 


1720 


100 


528 


AC007228 


Homo sapiens 


R31665 2 


1488 


47 


529 


X14830 


Homo sapiens 


acetylcholine receptor beta-subunit 
preprotein 


2639 


100 


530 


U80446 


Caenorhabditis 
elegans 


coded for by C. elegans cDNA 
ykl72e6.3; coded for by C. elegans 
cDNA ykl58f7.3; coded for by C. 
elegans cDNA ykl58f7.5; coded for 
by C. elegans cDNA ykl72e6.5 


420 


39 


531 


S76838 


Mus sp. 


Dbs 


4821 


88 


532 


Z82215 


Homo sapiens 


dJ6802.2 (myosin, heavy 
polypeptide 9, non-muscle) 


9828 


100 


533 


AF245505 


Homo sapiens 


adlican 


277 


31 


534 


AF300612 


Homo sapiens 


N-acetylgalactosamine-4-O-
sulfotransferase


993 


59 


535 


AL121928 


Homo sapiens 


bAl 8114.3 (pleckstrin and Sec7 
domain protein) 


3333 


99 


536 


AJ271055 


Mus musculus 


iroquois homeobox protein 6 


1724 


76 


537 


AF180473 


Homo sapiens 


Not2p 


2267 


100 


538 


AF071059 


Mus musculus 


zinc finger RNA binding protein


1089 


51


539 


AF023453 


Homo sapiens 


actin-related protein 3-beta 


2219 


100 


540 


AC003030


Homo sapiens 


R29828_1


1401 


70 


541 


AC003030


Homo sapiens 


R29828J 


2294 


100 


542 


AL121889 


Homo sapiens


dJ1076E17.1 (KIAA0823 protein 
(continues in AL023803)) 


2152 


100 


543


AB006135 


Rattus 

norvegicus 


db83 


1238 


98 


544 


G02650 


Homo sapiens 


Human secreted protein, SEQ ID 
NO: 6731. 


644 


97 


545 


Y07595 


Homo sapiens 


transcription factor TFIIH 


2373 


100 


546 


AL133545 


Homo sapiens 


bA386N14.1 (novel protein similar 
to a dual specificity phosphatase) 


964 


99 


547 


X83618 


Homo sapiens 


hydroxymethylglutaryl-CoA 
synthase 


2647 


100 


548 


AF134726 


Homo sapiens 


NG37 


4359 


99 


549 


AB035356 


Homo sapiens 


neurexin I-alpha protein 


6948 


99 


551 


AB037901 


Homo sapiens 


gene amplified in squamous cell 
carcinoma- 1 


5215 


99 


552 


AB043634 


Homo sapiens 


PAR-6A 


885 


100 


553 


AP000693 


Homo sapiens 


partial CDS 


4875 


99 


554 


AF002223 


Homo sapiens 


myotubularin related 1 


3490 


100 


555 


AC004893 


Homo sapiens 


similar to NEDD-4 (KIA0093); 
similar to P46934 (PID:gl 171682) 


1611 


100 


556 


AJ404468 


Homo sapiens 


axonemal dynein heavy chain 


8328 


100 


557 


AJ404468 


Homo sapiens 


axonemal dynein heavy chain 


11137 


100 





558 


X65873 


Homo sapiens 


kinesin heavy chain 


4860 


100 


559 


AJ277365 


Homo sapiens 


polyglutamine-containing protein 


592 


36 


560 


AF205600 


Homo sapiens 


transposase-like protein 


407 


27 


561 


X71125


Homo sapiens 


glutaminyl-peptide cyclotransferase 


1914 


100


562 


X71125 


Homo sapiens 


glutaminyl-peptide cyclotransferase 


1456 


97 


563 


X54304 


Homo sapiens 


myosin regulatory light chain 


897 


100 


564 


AF250842 


Drosophila 
melanogaster 


multiple asters 


130 


23 


565 


Y58608 


Homo sapiens


Protein regulating gene expression 
PRGE-1. 


1619 


99 


566 


AL121893 


Homo sapiens 


bA189K21.5 (novel protein similar 
to retinoblastoma binding protein 
(RBBP9)) 


1012 


100 


567 


AL117352


Homo sapiens 


dJ876B10.2 (novel protein (ortholog 
of rat EX084))


3713 


99 


568 


AF228603 


Homo sapiens 


pleckstrin 2 


1841 


100 


569 


AF239243 


Homo sapiens 


histone deacetylase 7 


3244 


86 


570 


AF087695 


Mus musculus 


veli3 


989 


100 


571 


AB046381 


Homo sapiens 


testis-abundant finger protein 


1346 


99 


572 


AC005551 


Homo sapiens 


R26529_2, partial CDS 


1020 


100 


573 


Y90290 


Homo sapiens 


Human peptidase, HPEP-7 protein
sequence. 


274 


52 


574 


W76734 


Homo sapiens 


Human mDia Rho targeting protein. 


712 


32 


575 


AL121935 


Homo sapiens 


bA517H2.3 (t-complex 10 (a murine 
tcp.homolog)) 


853 


78 


576 


Y86217 


Homo sapiens 


Human secreted protein HWHGU54, 
SEQ ID NO: 132. 


2123 


99 


577 


AL121716 


Homo sapiens 


dJ202D23.2 (novel protein)


6329 


99 


578 


AL121716 


Homo sapiens 


dJ202D23.2 (novel protein) 


6329 


99 


579 


X92715 


Homo sapiens 


KRAB/C2H2 zinc finger protein


3102 


97 


580 


X54637 


Homo sapiens 


protein tyrosine kinase 


5564 


98 


581 


X78817 


Homo sapiens 


p115


1148 


44 


582 


AJ251245 


Rattus 
norvegicus 


SECIS binding protein 2 


3086 


71 


583 


AF113125 


Homo sapiens 


E-1 enzyme 


581 


100 


584 


M19529 


Sus scrofa 


follistatin A


1906 


98 


585 


AF169677 


Homo sapiens 


leucine-rich repeat transmembrane 
protein FLRT3 


3403 


100 


586 


D87685 


Homo sapiens 


similar to human transcription factor 
TFIIS (S34159). 


8083 


99 


587 


Y00876 


Homo sapiens 


Human LAPH-1 protein sequence. 


2110 


100 


588 


Y99674 


Homo sapiens 


Human GTPase associated protein- 
25. 


2111 


99 


589 


D86973 


Homo sapiens 


similar to Yeast translation activator 
GCN1 (P1:A48126)


12033 


99 


590 


AL034452 


Homo sapiens 


dJ682J15.1 (novel Collagen triple 
helix repeat containing protein) 


1979 


100 


591 


Y57396 


Homo sapiens 


Human lysoenzyme LYC4 
polypeptide. 


814 


100 


592 


AJ297743 


Mus musculus 


torsinB protein 


1448 


85 


593 


AF164796 


Homo sapiens 


NADH:ubiquinone oxidoreductase
MLRQ subunit homolog 


469 


100 


594 


Y41312 


Homo sapiens 


Human secreted protein encoded by 
gene 5 clone HLDRM43. 


749 


94


595 


Y41312 


Homo sapiens 


Human secreted protein encoded by 
gene 5 clone HLDRM43. 


824 


100 


596 


Y77123 


Homo sapiens 


Human neurotransmission-associated 
protein (NTAP) 998868. 


2102


98 


597 


AF215703 


Drosophila 


KISMET-L long isoform 


1880 


65 






melanogaster 








598 


AF070447 


Homo sapiens 


barrier-to-autointegration factor 


290 


90 


599 


X56203 


Plasmodium 

falciparum 


liver stage antigen 


372 


22 


600 


X79828 


Mus musculus 


NKIO 


202 


53 


601 


AB004109 


Cricetulus 
griseus 


phosphatidylserine synthase II


2262 


92 


602 


U94988 


Mus musculus 


Nulp1


2912 


89 


603 


U94988 


Mus musculus 


Nulp1


2800 


86 


604 


AF006264 


Homo sapiens 


recombination and sister chromatid 
cohesion protein homolog 


2850 


100 


605 


AF006264 


Homo sapiens 


recombination and sister chromatid 
cohesion protein homolog 


2530 


100 


606 


X82260 


Homo sapiens 


RanGAP1


2929 


100 


607 


X82260 


Homo sapiens 


RanGAP1


1843 


97 


608 


AF160909 


Drosophila 
melanogaster 


BcDNA.LD03471 


943 


58 


610 


X74801 


Homo sapiens 


gamma subunit of CCT chaperonin 


2745 


99 


611 


AL031427 


Homo sapiens 


dJ167A19.1 (novel protein)


1608 


100 


612 


Y71072 


Homo sapiens 


Human membrane transport protein, 
MTRP-17. 


445 


100 


613 


X16396


Homo sapiens 


precursor polypeptide (AA -29 to 
315) 


1749 


100 


614 


AK000281 


Homo sapiens 


unnamed protein product 


1814 


99 


615 


AB011128 


Homo sapiens 


KIAA0556 protein 


5761 


99 


616 


U19361 


Petromyzon 
marinus 


NF-180 


205 


21 


617 


AF045555 


Homo sapiens 


wbscr1


1208 


100 


618 


AF045555 


Homo sapiens 


wbscr1 alternative spliced product


1318 


100 


619 


U22229 


Felis catus 


ribosomal protein L41 


128 


100 


620 


Y17169 


Homo sapiens 


A6 related protein 


1819 


100 


621 


Y12065


Homo sapiens 


hNop56 


2956 


99 


622 


AF177758 


Homo sapiens 


ubiquitin specific protease 16 


2998 


100 


623 


AF317425


Homo sapiens 


GAC-1 


3866 


100 


624 


AL050297 


Homo sapiens 


hypothetical protein 


1227 


99 


625 


AC007204 


Homo sapiens 


BC273239 1 


3398 


99 


626 


Z68747 


Homo sapiens 


imogen 38


2024 


99 


627 


Z68747 


Homo sapiens 


imogen 38 


1958 


97 


628 


Y70229 


Homo sapiens 


Human RNA-associated protein-10
(RNAAP-10). 


3424 


99 


629 


AF191492


Homo sapiens 


nasopharyngeal carcinoma associated 
gene protein-8 


613 


100 


630 


AF119664 


Homo sapiens 


transcriptional regulator protein 
HCNGP 


1574 


100 


631 


AF119664 


Homo sapiens 


transcriptional regulator protein 
HCNGP 


1150 


89


632 


Y17849


Homo sapiens 


ganglioside-induced differentiation 
associated protein 1 


1839 


98 


633 


X55740 


Homo sapiens 


5'-nucleotidase 


3012 


100 


634 


AF039688 


Homo sapiens 


antigen NY-CO-3 


931 


100 


635 


AF119662


Homo sapiens 


E46 protein 


2424 


100 


636 


AB007836 


Homo sapiens 


Hic-5 


2544 


100 


637 


AF077818 


Mus musculus 


syntrophin-associated serine-
threonine protein kinase


2027 


44 


638 


AL035455 


Homo sapiens 


dJ1018E9.1 (VAMP (vesicle- 
associated membrane protein)- 
associated protein B and C) 


150 


26 


639 


AF078844 


Homo sapiens 


hqp0376 protein


416 


81 







640 


U28377 


Escherichia 


ORF_f239; was ORF_f191 and


1198 


100 






coli 


ORF_f194 before splice






641 


AK024442 


Homo sapiens 


FLJ00032 protein 


1677 


56 


642 


U58682 


Homo sapiens 


ribosomal protein S28


340 


100 


643 


X57432 


Rattus rattus 


ribosomal protein S2 


1520 


98 


644 


AB002348 


Homo sapiens 


KIAA0350 protein


5186 


99 


646 


Y96202 


Homo sapiens 


IkappaB kinase (IKK) binding 


1178 


98 








protein, Y2H56. 






647 


AB029482 


Mus musculus 


JNK-binding protein JNKBP1


4609 


81 


648 


AB009053 


Arabidopsis
thaliana


contains similarity to isoamyl
acetate-hydrolyzing
esterase'-gene_id:MQB225


407


44


650 


AC002550 


Homo sapiens 


Unknown gene product 


858 


99 


651 


U26592 


Homo sapiens 


diabetes mellitus type I autoantigen 


253 


66 


652 


X60155 


Homo sapiens 


zinc finger 41


4349 


100 


653 


X53330 


Platynereis
dumerilii


H4 protein (AA 1-103)


523


100


654 


AC003682 


Homo sapiens 


R27945_2


2558 


100 


655 


X80473 


Mus musculus 


rab19


596 


56 


656 


J02649 


Rattus
norvegicus


unknown protein


201


95


657 


AC006014 


Homo sapiens 


similar to RFP transforming protein;
similar to P14373 (PID:g132517)


1331


99


658


X92972 


Homo sapiens 


protein phosphatase 6 


1666 


100 


659 


L35269 


Homo sapiens 


zinc finger protein 


2803 


99 


660 


AC003682 


Homo sapiens 


F18547 1 


3184 


96 


661 


X79204 


Homo sapiens 


ataxin-1


4195 


99 


662 


X17620 


Homo sapiens 


Nm23 protein 


965 


99 


663 


AB015617 


Homo sapiens 


ELKS 


1501 


80 


664 


Z56281 


Homo sapiens 


interferon regulatory factor 3 


2331 


100 


665 


AJ248283 


Pyrococcus
abyssi


LACTOYLGLUTATHIONE
LYASE (EC 4.4.1.5)
(METHYLGLYOXALASE)
(ALDOKETOMUTASE)
(GLYOXALASE I).


254


40


666 


Z70200 


Homo sapiens 


U5 snRNP-specific 200kD protein 


8819 


99 


667 


Z70200 


Homo sapiens 


U5 snRNP-specific 200kD protein 


8589 


97 


668 


AF153450 


Manduca sexta 


juvenile hormone esterase binding
protein


225


32


669 


AF227198 


Homo sapiens 


CrkRS 


7231 


99 


670 


X99586 


Homo sapiens 


SMT3C protein 


441 


87 


671 


Z61589_cd1


Homo sapiens 


17-AUG-1998 DNA encoding a
human OC-2 protein.


2593


100


672 


AJ132702 


Mus musculus 


ATFa-associated factor 


3240 


88 


673 


AF204159 


Homo sapiens 


potassium large conductance
calcium-activated channel beta 3a
subunit


1486


100


674 


G02061 


Homo sapiens 


Human secreted protein, SEQ ID
NO: 6142.


558


99


675 


G01246 


Homo sapiens 


Human secreted protein, SEQ ID
NO: 5327.


141


77


676 


AB016839 


Homo sapiens 


mob1


419 


42 


677 


D86970 


Homo sapiens 


similar to myosin heavy chain:
Containing ATP/GTP-binding site
motif A (P-loop)


161


28


678 


U83115 


Homo sapiens 


non-lens beta gamma-crystallin like
protein


8569


99


679 


AF203687 


Homo sapiens 


prolactin regulatory element-binding
protein


2181


100


680


M27685 


Mus musculus 


ultra-high sulphur keratin 


650 


58 


681 


U04968 


Cricetulus 
griseus 


nucleotide excision repair protein 


3712 


97 


682 


AF119663


Homo sapiens 


G-protein gamma-12 subunit


356 


100 


683 


G03733 


Homo sapiens 


Human secreted protein, SEQ ID 
NO: /ol4. 


342 


100 


684


Xo7699 


Homo sapiens 


CDwjz antigen 


297 


100 


685


AF022789 


Homo sapiens 


ubiquitin hydrolyzing enzyme I 


1892 


100 


686 


AJ001006


Mus musculus 


EMeg32 protein 


938 


96 


687 


W03516 


Homo sapiens 


Prostaglandin DP receptor. 


1864 


100 


688 


AF019661 


Mus musculus 


zeta proteasome chain; PSMA5 


1214 


100 


689 


AF156557


Homo sapiens 


stomatin related protein 


2036 


100 


690 


G03960 


Homo sapiens 


Human secreted protein, SEQ ID 
NO: 8041. 


593


100 


691 


AF161512 


Homo sapiens 


HSPC163 


738 


100 


692 


AL031115 


Homo sapiens 


ZXDA, ZXDB (zinc finger X-linked 
protein) 


4298 


100 


693 


L40410 


Homo sapiens 


thyroid receptor interactor 


806 


100 


694 


AC004542 


Homo sapiens 


OXYSTEROL-BINDING 
PROTEIN-like; similar to P22059 
(PID:gl29308) 


2533 


99 


695 


AF169411 


Rattus 
norvegicus 


PAPIN 


4144 


52 


696 


Y58168 


Homo sapiens 


Human hydrolase homologue HHH-4.


2144 


100 


697 


AF271994 


Homo sapiens 


dopamine responsive protein DRG-1 


1613 


100 


698 


Y41741 


Homo sapiens 


Human PRO704 protein sequence. 


1323 


100 


699 


AL133506 


Unknown 


/prediction=(method:""genscan"",
version:""1.0"", score:""109.13"");
/prediction=(method:


825 


48 


700 


Y96870 


Homo sapiens 


Human goose-type lysozyme 
(GOLY). 


1032 


100 


701 


AC003034 


Homo sapiens 


Gene with similarity to rat kidney- 
specific (KS) gene 


1190 


100 


702 


AC003034 


Homo sapiens 


Gene with similarity to rat kidney- 
specific (KS) gene 


937 


95 


703 


AJ242832


Homo sapiens 


calpain 


3756 


100 


704 


S52624 


Homo sapiens 


unknown 


185 


100 


705


AF005081 


Homo sapiens 


skin-specific protein


652 


100 


706 


Y16793


Homo sapiens 


keratin, type 1 


2232 


100 


707 


Y44985 


Homo sapiens 


Human epidermal protein-2.


455 


69 


708 


AF113220


Homo sapiens 


MSTP040 


686 


100 


709 


Y44985 


Homo sapiens 


Human epidermal protein-2. 


408 


65 


710 


Y16132 


Homo sapiens 


CDT6 


1874 


100


711 


Y68775 


Homo sapiens 


Amino acid sequence of a human 
phosphorylation effector PHSP-7. 


2407 


100 


712 


A63422 


Homo sapiens 


H(+)-transporting ATP synthase


209 


100 


713 


AF169968


Mus musculus 


DNA binding protein DESRT


1467 


79 


714 


A52563 


Bos taurus


permeability increasing protein


383 


29 


715 


AJ277739


Homo sapiens 


RPB11b1alpha protein


480 


98 


716 


AL135791 


Homo sapiens 


bA162G10.3 (zinc finger protein) 


401 


98 


717 


AF223466 


Homo sapiens 


HT015 protein


1311 


97 


719 


AF117383


Homo sapiens 


placental protein 13; PP13 


746 


100 


720 


Z98743 


Homo sapiens 


dJ181C9.2 (Rho GTPase activating 
protein 8 (RhoGAP, p50RhoGAP))


324 


100 


721 


AL163815 


Arabidopsis 
thaliana 


putative protein 


653 


61 


722 


G01436 


Homo sapiens 


Human secreted protein, SEQ ID
NO: 5517.


418


96


723 


AF282919 


Mus musculus 


Zfp228 


349 


49 


724 


AB023191 


Homo sapiens 


KIAA0974 protein


2953 


100 


725 


AL031778 


Homo sapiens 


dJ34B21.1 (novel BZRP 
(benzodiazapine receptor (peripheral) 
(MBR, PBR, PBKS, IBP, 
Isoquinoline-binding protein)) LIKE 
protein) 


920 


100 


726 


AL021939 


Homo sapiens 


dJ352A20.2 (aldehyde 
dehydrogenase family protein)


1764 


100 


727


AF182426


Rattus
norvegicus


arylacetamide deacetylase 


791 


42 


728 


Y08565 


Homo sapiens 


UDP-GalNAc:polypeptide N- 
acetylgalactosaminyltransferase 


3331 


99 


729 


AF155135 


Homo sapiens 


novel retinal pigment epithelial cell 
protein 


1652 


99 


730 


AL078606 


Arabidopsis 
thaliana 


putative protein


277 


55 


731 


Y73352 


Homo sapiens 


HTRM clone 1732368 protein 
sequence. 


1720 


100 


732 


AF178432 


Homo sapiens 


SH3 protein 


3302 


100 


733 


Y17832 


Human 
endogenous 
retrovirus K 


env protein 


223 


34 


734 


Y28859 


Homo sapiens 


Human mesoderm induction early 
response protein ER1.


2067 


98 


735 


U09355 


Oryctolagus 
cuniculus 


protein phosphatase 2A1 B gamma 
subunit 


2352 


99 


736 


Y94922 


Homo sapiens 


Human secreted protein clone pv6_1
protein sequence SEQ ID NO:50, 


724 


99 


737


AB027003 


Mus musculus 


protein phosphatase 


378 


84 


738 


AF112200 


Homo sapiens 


NADH-oxidoreductase B18 subunit 


739 


100 


739 


AF112200 


Homo sapiens 


NADH-oxidoreductase B18 subunit 


613 


88 


740 


AF302154 


Homo sapiens 


SPG protein 


6556 


100 


741 


B25681 


Homo sapiens 


Human secreted protein sequence 
encoded by gene 17 SEQ ID NO:70. 


1410 


99 


742 


L27479 


Homo sapiens 


X123 


1237 


99 


743 


L27479 


Homo sapiens 


X123 


1206 


97 


744 


Y66745 


Homo sapiens 


Membrane-bound protein PRO1186.


588 


99 


745 


AJ001019 


Homo sapiens 


ring finger protein 


1292 


99 


746 


X68453 


Sus scrofa 


tubulin-tyrosine ligase 


1882 


94 


747 


YSl^Sl 


Homo sapiens 


Human transmembrane protein 
HTMPN-2L 


1173 


100 


748 


AF151069


Homo sapiens 


HSPC235 


1694 


96 


749 


AF182404


Homo sapiens 


mitochondrial uncoupling protein 1 


1674 


100 


750 


AL121993 


Homo sapiens 


dJ776P7.1 (Novel protein) 


2500 


99 


751 


AF149825 


Homo sapiens 


PACSIN3 


2253 


100 


752 


AL008635 


Homo sapiens 


dJ510H16.2 (high-mobility group 
protein 2-like 1) 


3026 


99 


753 


Y57914 


Homo sapiens 


Human transmembrane protein 
HTMPN-38. 


1124 


100 


754 


AF285109 


Homo sapiens 


septin 3 isoform B


1766 


100 


755 


AF004161 


Oryctolagus 
cuniculus 


peroxisomal Ca-dependent solute
carrier


2371 


95 


756 


Z19585 


Homo sapiens 


thrombospondin-4 


4239 


100 


757 


AP001745 


Homo sapiens 


similar to zinc finger 5 protein 


1857 


100 


758 


AF190664


Mus musculus 


LMBR2 


555


72 


759 


AF090326 


Mus musculus 


AE-1 binding protein AEBP2 


1540 


97 


760 


AL096677 


Homo sapiens 


dJ322G13.3 (novel protein similar to
bovine and mouse beta-soluble NSF
attachment protein (SNAP-beta))


999


94


761 


AC003007 


Homo sapiens 


Unknown gene product (partial) 


649 


96 


762 


U66372 


Bos taurus 


ribosomal protein S29 


230 


73 


764 


Y90899 


Homo sapiens 


D1-like dopamine receptor activity
modifying protein SEQ ID NO:1.


1152 


100 


765 


U88169


Caenorhabditis 
elegans


similar to molybdopterin biosynthesis
MOEB proteins 


1204 


65 


766 


AL118506


Homo sapiens 


dJ591C20.3.1 (novel DnaJ domain 
protein, similar to mouse and bovine 
cysteine string protein) 


1091 


100 


767 


AK024693 


Homo sapiens 


unnamed protein product 


3767 


100 


768 


Z11518 


Homo sapiens 


histidyl-tRNA synthetase 


2582 


100 


769 


X13916 


Homo sapiens 


LDL-receptor related precursor (AA 
-19 to 4525) 


25529 


100 


770 


AC009360 


Arabidopsis 
thaliana


Contains 3 PF|00400 WD40, G-beta 
repeat domains. 


333 


33 


771 


AB037685 


Mus musculus 


LANP-like protein 


1246 


91 


772 


AL161578 


Arabidopsis 
thaliana 


putative protein 


335 


46 


773 


AL161578 


Arabidopsis 
thaliana 


putative protein 


333 


47 


774 


AY008271 


Homo sapiens 


helicase SMARCAD1


5264 


99 


775 


Y21591 


Homo sapiens 


Human secreted protein (clone
CC332-33). 


1127 


96 


776 


W88853 


Homo sapiens 


Polypeptide fragment encoded by 
gene 89. 


752 


100 


777 


W88853 


Homo sapiens 


Polypeptide fragment encoded by 
gene 89. 


752 


100 


778 


W88853 


Homo sapiens 


Polypeptide fragment encoded by 
gene 89. 


752 


100 


779 


AF196481


Homo sapiens 


RING finger protein; FXY2 


3644 


100 


780 


AL035427 


Homo sapiens 


dJ769N13.1 (KIAA0443 protein.) 


1609 


54 


781 


AB026187 


Homo sapiens 


protocadherin-Xa 


5244 


100 


782 


B24458 


Homo sapiens 


Human secreted protein sequence
encoded by gene 22 SEQ ID NO:83. 


1002 


100 


783 


AB027289


Homo sapiens 


cyclin-E binding protein 1 


5421 


100 


784 


G02916 


Homo sapiens 


Human secreted protein, SEQ ID 
NO: 6997. 


627 


100 


785 


AJ245822 


Homo sapiens 


type I transmembrane receptor 


4560 


100 


786 


AJ245820 


Homo sapiens 


type I transmembrane receptor 


4624 


100 


787 


Z48042 


Homo sapiens 


GPI-anchored protein p137


3340 


99 


788 


AL031782 


Homo sapiens 


dJ708F5.1 (PUTATIVE novel 
Collagen alpha 1 LIKE protein) 


2739 


100 


789 


AJ131245 


Homo sapiens 


Sec24B protein 


6602 


100 


790 


AF107203


Homo sapiens 


ataxin 2-binding protein 


2008 


100 


791 


Y14690


Homo sapiens 


procollagen alpha 2(V) 


600 


34 


792 


AL031055 


Homo sapiens 


dJ28H20.2 (novel protein) 


1267 


100 


793 


Y36194 


787 


Human secreted protein 


2051 


99 


794 


AB028127 


Homo sapiens 


mannosyltransferase 


2138 


96 


795 


AC007228 


Homo sapiens 


R31665_2


2738 


79 


796 


AL049482 


Arabidopsis 
thaliana 


putative protein 


436 


47 


797 


AC004528 


Homo sapiens 


R32184_3


891 


91 


798 


AB037830 


Homo sapiens 


KIAA1409 protein 


7532 


100 


799 


X53793 


Homo sapiens 


5' half of the product is homologous
to Bacillus subtilis SAICAR
synthetase, 3' half corresponds to the 
catalytic subunit of AIR carboxylase 


2232 


100


800


Y99350 


Homo sapiens 


Human PRO1378 (UNQ715) amino
acid sequence SEQ ID NO:33. 


1343 


100 


801


AB042636 


Homo sapiens 


junctophilin type3 


1225 


47 


802 


AB029324 


Rattus 
norvegicus 


TIP120-family protein TIP120B 


3916 


90 


803 


AB029324 


Rattus 
norvegicus 


TIP120-family protein TIP120B 


4961 


90 


804 


AF251040 


Homo sapiens 


putative nuclear protein 


2119 


100 


805


AB033281 


Homo sapiens 


F-box and WD-repeats protein beta- 
TRCP2 isoform C 


2879 


100 


806 


U87305 


Rattus
norvegicus


transmembrane receptor UNC5H1


3257 


90 


807 


AF118889 


Rattus 
norvegicus 


b-tomosyn isoform 


3155 


97 


808 


AF226993 


Rattus 
norvegicus 


selective LIM binding factor 


8793 


95 


809 


W19919 


Homo sapiens 


Human Ksr-1 (kinase suppressor of 
Ras). 


3939 


99 


810 


AL031782


Homo sapiens 


dJ708F5.1 (PUTATIVE novel 
Collagen alpha 1 LIKE protein) 


1546 


100 


811 


AC002542 


Homo sapiens 


similar to C. elegans F11A10.5; 80% 
similarity to Z68297 (PID:g1130619)


2294 


100 


812 


U83246 


Homo sapiens 


copine I 


606 


52 


813 


AF242552 


Gallus gallus


retinovin 


945 


34 


814 


X52332 


Homo sapiens


zinc finger protein 10 


1651 


93 


815 


X52332 


Homo sapiens 


zinc finger protein 10


2423 


99 


816 


Y09631 


Homo sapiens 


PIBF1 protein


2935 


99 


817 


X71997 


Rattus 
norvegicus 


myosin I 


3883 


98 


818 


AY004877 


Mus musculus 


cytoplasmic dynein heavy chain


11105 


98 


819 


Y27196 


Homo sapiens 


Human cyclic nucleotide 
phosphodiester PDE8B(E) amino 
acid sequence. 


3790 


100 


820 


AF081947 


Mus musculus 


tektin 


1134 


81 


821 


AL035106 


Homo sapiens 


dJ998C11.1 (continues in
Em:AL445192 as bA269H4.1) 


871 


100 


822 


AF022795 


Homo sapiens 


TGF beta receptor associated protein-1


385 


24 


823 


AF015770 


Mus musculus 


radical fringe 


1422 


82 


824 


U82695 


Homo sapiens 


expressed-Xq28STS protein 


1444 


99 


825 


X77371 


Mesocricetus 
auratus 


COR1


641 


78 


826 


AB014576 


Homo sapiens 


KIAA0676 protein


296 


79 


827 


AL049733 


Homo sapiens 


dJ875H3.1 (APK1 antigen)


1584 


72 


828 


AF222980 


Homo sapiens 


disrupted in Schizophrenia 1 protein 


4418 


100 


829 


Z31560 


Homo sapiens 


sox-2 


1683 


100 


830 


AF295773 


Homo sapiens 


ral guanine nucleotide dissociation 
stimulator 


4717 


99 


831 


AB041926 


Homo sapiens 


GCK family kinase MINK-2 


6866 


100 


832 


L04948 


Saccharomyces
cerevisiae


mitochondrial transporter protein 


338


35 


833 


AJ007012 


Mus musculus 


Fish protein 


704 


94 


834 


Z34289 


Homo sapiens 


nucleolar phosphoprotein p130


3455 


99 


835 


U10991 


Homo sapiens 


G2 


8436 


98 


836 


AF230877 


Homo sapiens 


MIP-T3 


2945 


99 


837 


X58288 


Homo sapiens 


protein-tyrosine phosphatase 


7734 


99 


838 


X56958 


Homo sapiens 


ankyrin (brank-2) 


9631 


100 


839 


AC024791 


Caenorhabditis 
elegans 


contains similarity to beta-lactamases 


370 


24


840


D83197 


Homo sapiens 


ankyrin repeat protein 


802 


99 


841 


AF053711 


Serinus 
canaria 


neurofilament medium subunit 


192 


31 


842 


AF283772 


Homo sapiens 


similar to Homo sapiens ribosomal 
protein L10 encoded by GenBank
Accession Number L25899 


990 


96 


843 


U76343 


Homo sapiens 


GABA transport protein 


2992 


98 


844 


Y13645 


Homo sapiens 


uroplakin II 


897 


100 


845 


D21064 


Homo sapiens 


similar to rat general mitochondrial 
matrix processing protease mRNA 
(RATMPP). 


2710 


99 


846 


AF192522


Homo sapiens 


Niemann-Pick C3 protein; NPC3 


7047 


100 


847 


AF192522


Homo sapiens 


Niemann-Pick C3 protein; NPC3 


5472 


100 


848 


X60489 


Homo sapiens 


elongation factor-1-beta


1162 


100 


849 


AC007204 


Homo sapiens 


BC273239_1


2277 


67 


850 


AC003682 


Homo sapiens 


R28830_1


2401 


100 


851 


AL121583 


Homo sapiens 


bA358N2.1 (novel protein)


353 


61 


852 


Z48475 


Homo sapiens 


glucokinase regulator 


3155 


99 


853 


Z83844 


Homo sapiens 


dJ37E16.2 (SH3-domain binding 
protein 1) 


1884 


98 


854 


AF233323 


Homo sapiens 


Fas-associated phosphatase-1


390 


36


855 


AF062741 


Rattus 
norvegicus 


pyruvate dehydrogenase phosphatase
isoenzyme 2 


447 


80 


856 


Y11411 


Homo sapiens 


pristanoyl-CoA oxidase 


3595 


98 


857 


M97188 


Strongylocentrotus
purpuratus


tektin A1


290 


46 












858 


AB001105 


Homo sapiens 


hippocalcin-like protein 4 


995 


100 


859 


AF164791 


Homo sapiens 


putative 38.3kDa protein 


1795 


100 


860 


AF298117 


Homo sapiens 


homeobox protein OTX2


1477 


93 


861 


AF015264 


Rattus 
norvegicus 


golgi peripheral membrane protein 
p65 


1820 


81 


862 


X16901 


Homo sapiens 


30kD subunit of RAP30/74


1284 


100 


863 


M12140 


Homo sapiens 


envelope protein 


202 


81 


864 


AF161459 


Homo sapiens 


HSPC109 


815 


98 


865 


AL109983 


Homo sapiens 


dJ718P11.1.1 (novel class II
aminotransferase similar to serine
palmitoyltransferase (isoform 1))


444 


100 


866 


M77183 


Rattus 
norvegicus 


alpha-1-macroglobulin


227 


45 


867 


AF272663 


Homo sapiens 


gephyrin 


3785 


100 


868 


X75285 


Mus musculus 


fibulin-2 


3258 


87 


869 


X82494 


Homo sapiens 


fibulin-2 


3407 


99 


870 


AJ297743 


Mus musculus 


torsinB protein 


169 


43 


871


AJ278313 


Homo sapiens 


phospholipase C-beta-1a


6258 


99 


872 


AF073344 


Homo sapiens 


ubiquitin-specific protease 3 


256 


43 


873 


Y91955 


Homo sapiens 


Human cytoskeleton associated 
protein 10 (CYSKP-10).


535 


100 


874 


AJ000414 


Homo sapiens 


Cdc42-interacting protein 4 


1136 


53 


875 


AF265555 


Homo sapiens 


ubiquitin-conjugating BIR-domain
enzyme APOLLON 


627 


100 


876 


Y48586 


Homo sapiens 


Human breast tumour-associated 
protein 47. 


2537 


98 


877 


AF182198 


Homo sapiens 


intersectin 2 long isoform 


8764 


99 


878 


L17308 


Gossypium 
hirsutum 


proline-rich cell wall protein 


192 


35 


879 


AF177169 


Homo sapiens 


tropomodulin 2 


1769 


100 


880 


W03627 


Homo sapiens 


Human follicle stimulating hormone
GPR N-terminal sequence.


210 


23


881


AL021068 


Homo sapiens 


dJ206D15.3 


2615 


99 


882 


AC005498 


Homo sapiens 


R31665_2


318 


82 


883 


AF165518 


Homo sapiens 


MAGOH isoform 


182 


94 


884 


D21211 


Homo sapiens 


protein tyrosine phosphatase (PTP- 
BAS, type 3) 


368 


43 


885 


U13045 


Homo sapiens 


nuclear respiratory factor-2 subunit 
beta 1 


869 


62 


886 


X52836 


Homo sapiens 


tryptophan hydroxylase (AA 1-444)


2320 


98 


887 


X51466 


Homo sapiens 


elongation factor 2 


4460 


100 


888 


AB039903 


Homo sapiens 


interferon-responsive finger protein 1 
long form 


1096 


98 


889 


X51760 


Homo sapiens 


zinc finger protein (583 AA) 


3130 


100 


890 


AJ243396 


Homo sapiens 


voltage-gated sodium channel beta-3 
subunit 


1024 


100 


891 


W67928 


Homo sapiens 


Fragment of human secreted protein 
encoded by gene 4. 


391 


100 


892 


AB020598 


Homo sapiens 


peptide transporter 3 


3017 


100 


893 


Y66648 


Homo sapiens 


Membrane-bound protein PRO1120.


4722 


99 


894 


Y66648 


Homo sapiens 


Membrane-bound protein PRO1120.


3606 


96 


895 


A29218_cd1


Homo sapiens 


19-NOV-1998 DNA encoding G- 
protein coupled 7 TM receptor with 
AXOR15 activity.


2178 


100 


896 


AJ000332 


Homo sapiens 


Glucosidase II 


5063 


99 


897 


X98259 


Homo sapiens 


M-phase phosphoprotein 8 


1085 


100 


898 


X57110 


Homo sapiens 


c-cbl protein 


4849 


99 


899 


X63652 


Homo sapiens 


inter-alpha-trypsin inhibitor heavy 
chain ITIH1


3376 


98 


900 


X85134 


Homo sapiens 


RB protein binding protein


2816 


99 


901 


L11672


Homo sapiens 


zinc finger protein 


2047 


58 


902 


Y85565 


Homo sapiens 


Human homologue of UNC-53
(Hs-UNC-53/2) sequence.


369 


83 


903 


X54871 


Homo sapiens 


ras related protein Rab5b 


1094 


100 


904 


Z98265 


Homo sapiens 


plakophilin 3 


4065 


100 


905 


AL035295 


Homo sapiens 


hypothetical protein 


959 


99 


906 


AF051782 


Homo sapiens 


diaphanous 1 


801 


35 


907 


AF208536 


Homo sapiens 


nucleotide binding protein; NBP 


1372 


100 


908 


U79240 


Homo sapiens 


serine/threonine protein kinase 


2365 


98 


909 


U79240 


Homo sapiens 


serine/threonine protein kinase 


2386 


99 


910 


AJ132545 


Homo sapiens 


protein kinase 


2921 


100 


911 


AJ132545


Homo sapiens 


protein kinase 


1637 


99 


912 


AL121733 


Homo sapiens 


hypothetical protein 


1344 


99 


913 


Y67579 


Homo sapiens 


Human death inducer-obliterator 1 
(DIO-1) polypeptide. 


1586 


100 


914 


X87342 


Homo sapiens 


Human giant larvae homologue


5317 


99 


915 


X87342 


Homo sapiens 


Human giant larvae homologue 


3495 


96 


916 


M94362 


Homo sapiens 


lamin B2 


2357 


93 


917 


AJ011654


Homo sapiens 


triple LIM domain protein 


3432 


100 


918 


AJ131899 


Rattus 
norvegicus 


proline rich synapse associated
protein 1


5776 


88 


919 


AF054986 


Homo sapiens 


putative transmembrane GTPase 


1816 


100 


920 


U95822 


Homo sapiens 


putative transmembrane GTPase 


1237 


100 


921 


Y11588 


Homo sapiens 


apoptosis specific protein 


1492 


100 


922 


X84195 


Homo sapiens 


acylphosphatase 


510 


100 


923 


U72882 


Homo sapiens 


interferon-induced leucine zipper 
protein 


1409 


99 


924 


AE000660 


Homo sapiens 


hADV36S1


573 


100 


925 


AF126245 


Homo sapiens 


acyl-Coenzyme A dehydrogenase-8 
precursor 


2162 


100


SEQ
ID
NO:


ACCESSION
NUMBER


SPECIES


DESCRIPTION


SMITH-
WATERMAN
SCORE


% 

IDENTITY 


926 


AE001968 


Deinococcus 
radiodurans 


hypothetical protein 


147 


27 


927 


W81576 


Homo sapiens 


EBV-induced G-protein coupled 
receptor (EBI-2) polypeptide. 


1778 


100 


928 


U01317 


Homo sapiens 


beta-globin 


687 


94 


929 


X98333 


Homo sapiens 


organic cation transporter 


2933 


100 


930 


Y91444 


Homo sapiens 


Human secreted protein sequence
encoded by gene 42 SEQ ID
NO:165.


1401 


100 


931 


Y91644 


Homo sapiens 


Human secreted protein sequence 
encoded by gene 43 SEQ ID 
NO:317. 


1243 


100 


932 


D90279


Homo sapiens 


collagen alpha 1(V) chain precursor 


569 


39 


933 


Z31560 


Homo sapiens 


sox-2 


1587 


96 


934 


AF147790


Homo sapiens


transmembrane mucin 12 


3047 


99 


935 


Z85996 


Homo sapiens 


match: multiple proteins; match:
Q08151 P28185 Q01111 Q43554;
match: Q08150 Q40195 P20340
Q39222; match: Q40368 P36412
P40393 Q40723; match: CE01798
Q38923 Q40191 Q41022; match:
Q39433 Q40177 Q40218 Q08146;
match: P10949 P11023 Q16948
Q20337; match: Q25389 P25228
P20336 P05713; match: P35276
Q08147 P17609 P22128; match:
Q15771 P36410 P35291;
GTP-binding


726 


94 


936 


AB041533 


Homo sapiens 


sperm antigen 


1054 


38 


937 


X91906 


Homo sapiens 


voltage-gated chloride ion channel 


3914 


100 


938 


AB032481 


Homo sapiens 


homeobox transcription factor 


1744 


100 


939 


AF111106 


Homo sapiens 


protein serine/threonine phosphatase 
4 regulatory subunit 1 


4682 


99 


940 


Y17999


Homo sapiens 


Dyrk1B protein kinase


3331 


99 


941 


AF305872 


Homo sapiens 


thyroglobulin 


455 


92 


942 


AF263462 


Homo sapiens 


cingulin 


5939 


99 


943 


AK024442 


Homo sapiens 


FLJ00032 protein 


1616 


61 


944 


Y35911 


Homo sapiens 


Extended human secreted protein 
sequence, SEQ ID NO. 160. 


262 


35 


945 


AB015320


Homo sapiens 


sigma1B subunit of AP-1 clathrin
adaptor complex 


599 


71 


946 


Z82287 


Caenorhabditis 
elegans 


ZK550.2 


229 


35 


947 


D84223 


Homo sapiens 


leucyl tRNA synthetase


6207 


99 


948 


U49057 


Rattus 
norvegicus 


rA9 


3846 


62 


949 


AK000568 


Homo sapiens 


unnamed protein product 


1659 


100 


950 


AL021578 


Homo sapiens 


dJ453C12.6.1 (uncharacterized 
hypothalamus protein (isoform 1)) 


257 


42 


951 


AB032435 


Homo sapiens 


differentiation-associated Na- 
dependent inorganic phosphate 
cotransporter 


3063 


99 


952 


AF110532


Homo sapiens 


uncoupling protein UCP-4 


1561 


100 


953 


X83587 


Mus musculus 


1A13 protein 


1420 


59 


954 


AL031665 


Homo sapiens 


dJ545L17.5.1 (novel protein) 


386 


53 


955 


Y87600 


Homo sapiens 


Human fatty acid synthase-like 
protein (HFASLP). 


2377 


100 


956 


Y99421 


Homo sapiens 


Human PRO1433 (UNQ738) amino
acid sequence SEQ ID NO:292. 


522 


55


SEQ
ID 
NO: 


ACCESSION 
NUMBER 


SPECIES 


DESCRIPTION 


SMITH- 
WATERMAN 
SCORE 


% 

IDENTITY


957 


U68535 


Mus musculus 


aldo-keto reductase 


451 


73 


958 


AC007067 


Arabidopsis 
thaliana 


T10O24.10 


1594 


57 


959 


U72194 


Mus musculus 


muskelin 


3947 


99 


960 


AE003661 


Drosophila 
melanogaster 


CG15168 gene product


277 


54 


961 


X80332 


Mus musculus 


rab20 


983 


82 


962 


Y67315 


Homo sapiens 


Human secreted protein BL89_13 
amino acid sequence. 


3916 


99 


963 


Y67315 


Homo sapiens 


Human secreted protein BL89_13 
amino acid sequence. 


3916 


99 


964 


L32602 


Rattus
norvegicus


homeodomain 159..341


1821 


96 


965 


Z97832 


Homo sapiens 


dJ329A5.3 (KIAA06460 protein)


3581 


99 


966 


W88995 


Homo sapiens 


Polypeptide fragment encoded by 
gene 146. 


176 


39 


967 


U12465 


Homo sapiens 


ribosomal protein L35 


604 


100 


968 


AF151803 


Homo sapiens 


CGI-45 protein 


1101 


78 


969 


W74865 


Homo sapiens 


Human secreted protein encoded by 
gene 137 clone HMWIF35. 


1348 


98 


970 


L21936 


Homo sapiens 


succinate dehydrogenase flavoprotein 
subunit 


703 


100


971 


AJ133521 


Drosophila 
buzzatii 


protease, reverse transcriptase, 
ribonuclease H, integrase 


194 


23 


972 


AC006017 


Homo sapiens 


N-acetylgalactosaminyltransferase; 
similar to Q10473 (PID:g1709559)


3271 


100 


973 


Z81317 


Schizosaccharomyces
pombe


DNA2-NAM7 helicase family 
protein 


685 


31 


974 


M17885


Homo sapiens 


acidic ribosomal phosphoprotein (P0)


792 


100 


975 


U22829 


Mus musculus 


P2Y purinoceptor 


399 


40 


976 


AL132772 


Homo sapiens 


dJ1013A22.1 (hepatic nuclear factor
4, alpha)


2466 


99 


977 


AC003973 


Homo sapiens 


ZNF91L 


1550 


43 


978 


J04031 


Homo sapiens 


MDMCSF (EC 1.5.1.5; EC 3.5.4.9; 
EC 6.3.4.3) 


2824 


63 


979 


AF136715 


Homo sapiens 


taxol resistant associated protein 


217 


76 


980 


AF136715 


Homo sapiens 


taxol resistant associated protein 


306 


95 


981 


Z92822 


Caenorhabditis 
elegans 


ZK520.1 


1109 


44 


982 


AJ295149 


Homo sapiens 


putative dipeptidase 


1564 


99 


983 


AL021331 


Homo sapiens 


dJ366N23.3 (KIAA0173 and 
Tubulin-Tyrosine Ligase LIKE) 


1492 


100 


984 


AL161501 


Arabidopsis 
thaliana 


putative adenosine deaminase 


370 


38 



TABLES 



SEQ 
ID 
NO: 


ACCESSION 
NO. 


DESCRIPTION 


RESULTS* 


2 


BL00282 


Kazal serine protease inhibitors family 
proteins. 


BL00282 16.88 4.259e-14 97-120


3 


BL00298 


Heat shock hsp90 proteins family
proteins. 


BL00298A 10.97 1.000e-40 74-119
BL00298E 27.30 1.000e-40 321-376
BL00298F 11.21 1.000e-40 409-464
BL00298H 20.50 1.000e-40 553-607
BL00298C 16.40 2.286e-40 186-230
BL00298B 15.64 1.290e-39 134-181
BL00298G 24.57 5.345e-39 465-520
BL00298I 30.07 7.818e-34 661-715
BL00298D 17.97 6.226e-33 242-282


4 


PR00237 


RHODOPSIN-LIKE GPCR 
SUPERFAMILY SIGNATURE 


PR00237A 11.48 4.316e-13 57-82 


5 


PD02454 


! ! ! ! PROTEIN ALU SUBFAMILY 
WARNING ENTRY NUCLEAR 
PHOSPHO. 


PD02454B 11.61 4.309e-17 75-103


6 


DM00864 


EGF-LIKE DOMAIN. 


DM00864A 15.21 7.429e-09 98-119


7 


PR00237 


RHODOPSIN-LIKE GPCR 
SUPERFAMILY SIGNATURE 


PR00237A 11.48 1.750e-11 29-54
PR00237D 8.94 7.000e-09 138-160
PR00237B 13.50 8.250e-09 61-83


9 


PF00855 


PWWP domain proteins. 


PF00855 13.75 5.667e-15 272-289


10 


BL00139 


Eukaryotic thiol (cysteine) proteases
cysteine proteins. 


BL00139D 9.24 4.400e-11 391-408
BL00139A 10.29 7.511e-09 67-77


12 


BL01113 


C1q domain proteins.


BL01113B 18.26 9.294e-19 689-725
BL01113C 13.18 4.857e-11 757-777
BL01113D 7.47 2.161e-10 790-800


13 


BL01113 


Clq domain proteins. 


BL01113B 18.26 3.813e-14 599-635
BL01113C 13.18 4.857e-11 667-687
BL01113D 7.47 2.161e-10 700-710


14 


BL00594 


Aromatic amino acids permeases 
proteins. 


BL00594A 16.75 6.531e-10 50-94


15 


BL01047 


Heavy-metal-associated domain proteins. 


BL01047B 19.73 4.913e-13 707-728


16 


PR00625 


DNAJ PROTEIN FAMILY
SIGNATURE 


PR00625A 12.84 7.462e-18 310-330
PR00625B 13.48 3.939e-15 340-361


18 


BL00615 


C-type lectin domain proteins. 


BL00615A 16.68 3.700e-09 144-162


20 


PR00741 


GLYCOSYL HYDROLASE FAMILY 
29 SIGNATURE 


PR00741D 16.11 9.082e-21 175-195
PR00741F 14.66 9.262e-21 243-265
PR00741B 14.23 1.947e-18 128-145
PR00741G 9.29 2.180e-17 318-340
PR00741C 9.16 7.328e-17 147-166
PR00741H 10.32 2.141e-13 351-374
PR00741A 9.24 3.596e-13 89-105
PR00741E 13.39 3.535e-12 215-232


22 


BL00107 


Protein kinases ATP-binding region 
proteins. 


BL00107A 18.39 3.647e-20 117-148
BL00107B 13.31 1.000e-16 182-198


23 


BL00107 


Protein kinases ATP-binding region 
proteins. 


BL00107A 18.39 1.600e-23 126-157


24 


BL00107 


Protein kinases ATP-binding region 
proteins. 


BL00107A 18.39 1.600e-23 126- 
157 


27 


BL00239 


Receptor tyrosine kinase class II proteins.


BL00239B 25.15 2.324e-16 91- 
139 


28 


BL00018 


EF-hand calcium-binding domain 
proteins. 


BL00018 7.41 3.250e-10 681-694 
BL00018 7.41 6.400e-10 717-730 


29 


BL00018 


EF-hand calcium-binding domain
proteins.


BL00018 7.41 3.250e-10 681-694
BL00018 7.41 6.400e-10 717-730


30 


BL01113


C1q domain proteins.


BL01113A 17.99 9.308e-09 54-81


33


PD01168 


SYNTHETASE LIGASE PROTEIN 
ALANYL. 


PD01168L 9.47 1.667e-09 401-
416 


34 


PD01168


SYNTHETASE LIGASE PROTEIN
ALANYL.


PD01168L 9.47 1.667e-09 411-
426


36 


PR00426 


C5A-ANAPHYLATOXIN RECEPTOR 
SIGNATURE 


PR00426D 10.59 3.618e-12 110-
122


37 


PF00791 


Domain present in ZO-1 and Unc5-like 
netrin receptors. 


PF00791B 28.49 2.049e-10 1080- 
1135 


38 


BL00350 


MADS-box domain proteins. 


BL00350 20.79 1.000e-40 1-55


40 


BL00123 


Alkaline phosphatase proteins. 


BL00123B 19.31 1.000e-40 90-
133 BL00123C 24.61 1.000e-40
145-195 BL00123E 22.25 1.000e-
40 304-358 BL00123G 26.01
1.000e-40 438-488 BL00123F
19.03 8.714e-35 364-399
BL00123A 10.80 9.000e-24 52-77
BL00123D 12.73 1.000e-17 216-
229


44 


PD00066 


PROTEIN ZINC-FINGER METAL- 
BINDI. 


PD00066 13.92 2.800e-14 346-359 
PD00066 13.92 4.600e-14 486-499 
PD00066 13.92 1.000e-13 374-387
PD00066 13.92 6.000e-13 458-471
PD00066 13.92 2.714e-12 234-247
PD00066 13.92 3.143e-12 430-443
PD00066 13.92 8.714e-12 514-527
PD00066 13.92 3.739e-11 402-415
PD00066 13.92 2.038e-10 318-331 


45 


DM00973 


3 kw RESISTANCE BENOMYL 
YLL028W CYCLOHEXIMIDE. 


DM00973A 21.17 2.946e-10 180- 
217 


47 


BL00649 


G-protein coupled receptors family 2 
proteins. 


BL00649C 17.82 1.682e-10 475-
501 BL00649B 20.68 7.387e-09
417-463


50 


PD00066 


PROTEIN ZINC-FINGER METAL- 
BINDI. 


PD00066 13.92 8.200e-16 445-458 
PD00066 13.92 5.846e-15 305-318 
PD00066 13.92 1.000e-14 221-234
PD00066 13.92 1.000e-14 417-430
PD00066 13.92 2.800e-14 249-262
PD00066 13.92 2.800e-14 277-290
PD00066 13.92 8.800e-14 333-346
PD00066 13.92 9.400e-14 361-374
PD00066 13.92 4.000e-13 389-402
PD00066 13.92 6.571e-12 473-486


51 


BL00226 


Intermediate filaments proteins. 


BL00226D 19.10 1.000e-40 417-
464 BL00226B 23.86 3.348e-35
251-299 BL00226C 13.23 1.429e-
24 316-347 BL00226A 12.77
1.857e-15 151-166


52 


PR00217 


43 KD POSTSYNAPTIC PROTEIN 
SIGNATURE 


PR00217C 10.91 5.648e-09 133-
149


53 


BL00232 


Cadherins extracellular repeat proteins 
domain proteins. 


BL00232B 32.79 1.000e-40 143-
191 BL00232A 27.72 2.350e-28
49-82 BL00232B 32.79 7.052e-21
252-300 BL00232C 10.65 6.625e-
20 250-268 BL00232B 32.79
1.314e-11 367-415 BL00232C
10.65 9.308e-10 470-488


54 


BL00303 


S-100/ICaBP type calcium binding
protein.


BL00303B 26.15 8.759e-23 125-
162 BL00303A 21.77 1.000e-21
82-119


58 


PR00378 


INOSITOL PHOSPHATASE 
SIGNATURE 


PR00378D 16.86 1.000e-15 242-
261 PR00378B 13.80 9.250e-13 
109-129 


59 


PR00425 


BRADYKININ RECEPTOR 
SIGNATURE 


PR00425C 13.23 9.040e-12 120- 
140 


60 


BL00280 


Pancreatic trypsin inhibitor (Kunitz) 
family proteins. 


BL00280 24.61 6.727e-38 238-282 
BL00280 24.61 1.514e-30 294-338


65 


BL01019 


ADP-ribosylation factors family proteins. 


BL01019A 13.20 1.222e-11 43-83


68 


PR00237 


RHODOPSIN-LIKE GPCR 
SUPERFAMILY SIGNATURE


PR00237E 13.03 5.091e-13 188-
212 PR00237G 19.63 7.207e-13
268-295 PR00237A 11.48 4.375e-
11 24-49 PR00237C 15.69
3.057e-10 101-124 PR00237D
8.94 4.750e-10 137-159
PR00237F 13.57 5.364e-10 230-
255 PR00237B 13.50 9.438e-10
57-79


70 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 7.938e-28 31-70


71 


PR00830 


ENDOPEPTIDASE LA (LON) SERINE 
PROTEASE (S16) SIGNATURE 


PR00830A 8.41 8.759e-12 348-
368 


72 


BL00120 


Lipases, serine proteins. 


BL00120B 11.37 2.149e-10 148-
163


77 


PR00753 


1-AMINOCYCLOPROPANE-1-
CARBOXYLATE SYNTHASE
SIGNATURE


PR00753E 8.01 3.552e-11 191-
216 PR00753D 6.85 2.778e-09
131-153


78 


PR00506 


D21 CLASS N6 ADENINE-SPECIFIC 
DNA METHYLTRANSFERASE 
SIGNATURE 


PR00506C 19.40 8.017e-09 96-
119 


82 


BL00107 


Protein kinases ATP-binding region 
proteins. 


BL00107A 18.39 3.571e-16 436-
467


84 


BL00675 


Sigma-54 interaction domain proteins 
ATP-binding region A proteins. 


BL00675A 24.86 8.800e-10 256- 
300 


85 


BL00027 


'Homeobox' domain proteins.


BL00027 26.43 2.286e-30 117-160


87 


BL00250 


TGF-beta family proteins. 


BL00250A 21.24 6.786e-36 264- 
300 BL00250B 27.37 1.450e-26 
328-364 


91 


BL00215 


Mitochondrial energy transfer proteins. 


BL00215A 15.82 9.250e-17 10-35 
BL00215A 15.82 6.000e-16 221-
246 BL00215A 15.82 7.857e-12 
108-133 BL00215B 10.44 9.526e- 
11 168-181 


92 


BL00027 


'Homeobox' domain proteins. 


BL00027 26.43 9.526e-24 324-367 


95 


PR00094 


ADENYLATE KINASE SIGNATURE 


PR00094C 12.94 1.000e-08 119-
136


96 


PD02327 


GLYCOPROTEIN ANTIGEN 
PRECURSOR IMMUNOGLO. 


PD02327B 19.84 2.091e-09 143-
165


97 


BL00752 


XPA protein. 


BL00752B 19.17 7.309e-09 28-72


98 


PR00876 


NEMATODE METALLOTHIONEIN 
SIGNATURE 


PR00876B 7.66 2.268e-10 135- 
149 


99 


PR00109 


TYROSINE KINASE CATALYTIC 
DOMAIN SIGNATURE 


PR00109B 12.27 9.824e-12 122- 
141 


100 


BL00027 


'Homeobox' domain proteins.


BL00027 26.43 7.429e-31 118-161 


101 


BL00028 


Zinc finger, C2H2 type, domain proteins.


BL00028 16.07 6.870e-12 370-387
BL00028 16.07 6.885e-11 398-415
BL00028 16.07 8.269e-11 342-359
BL00028 16.07 4.300e-10 229-246
BL00028 16.07 6.100e-10 258-275


102 


PR00048 


C2H2-TYPE ZINC FINGER 
SIGNATURE 


PR00048A 10.52 7.750e-14 665-
679 PR00048A 10.52 8.500e-14
581-595 PR00048A 10.52 9.250e-
14 637-651 PR00048A 10.52
2.059e-12 609-623 PR00048A
10.52 2.588e-12 469-483
PR00048A 10.52 7.353e-12 553-
567 PR00048A 10.52 2.895e-11
525-539 PR00048A 10.52 4.316e-
11 441-455 PR00048A 10.52
5.263e-11 413-427 PR00048B
6.02 2.125e-10 569-579
PR00048B 6.02 4.938e-10 513-
523 PR00048A 10.52 5.696e-10
497-511 PR00048B 6.02 8.875e-
10 429-439 PR00048B 6.02
1.000e-09 457-467 PR00048B
6.02 6.684e-09 485-495


103 


PR00195 


DYNAMIN SIGNATURE 


PR00195A 11.94 5.364e-22 31-50 
PR00195B 9.47 1.783e-21 56-74 
PR00195C 11.50 3.455e-21 126-
144 PR00195D 11.76 8.714e-21
175-194 PR00195F 16.20 8.500e-
20 217-237 PR00195E 9.82
8.650e-20 194-211


104 


BL01113 


C1q domain proteins.


BL01113A 17.99 1.865e-09 121-
148 BL01113A 17.99 5.846e-09
82-109


105 


BL00420 


Speract receptor repeat proteins domain 
proteins. 


BL00420A 20.42 6.400e-11 70-99
BL00420A 20.42 8.525e-10 73- 
102 BL00420A 20.42 5.708e-09 
85-114 


108 


PR00860 


VERTEBRATE METALLOTHIONEIN
SIGNATURE


PR00860B 7.04 2.929e-20 27-41
PR00860A 5.46 5.500e-16 5-18
PR00860C 9.61 1.474e-14 41-51


112 


BL01031 


Heat shock hsp20 proteins family profile. 


BL01031C 17.68 6.400e-10 122- 
147 


114 


DM01840 


kw SPAC24B11.09 R07E5.13.


DM01840B 22.04 2.688e-40 59-
103 DM01840A 10.95 9.571e-13 
31-43 


115 


BL01126 


Elongation factor Ts proteins. 


BL01126A 18.48 2.317e-30 46-89
BL01126B 13.15 7.387e-19 116-
135 BL01126C 9.20 9.735e-11
190-203


116 


BL00216 


Sugar transport proteins. 


BL00216B 27.64 4.375e-21 35-85


118 


BL00437 


Catalase proximal heme-ligand proteins. 


BL00437A 18.82 1.000e-40 49-
101 BL00437B 16.28 1.000e-40
114-168 BL00437C 21.86 1.000e-
40 190-239 BL00437D 25.72
1.000e-40 248-301 BL00437E
23.95 1.000e-40 327-379


119 


BL00140 


Ubiquitin carboxyl-terminal hydrolase
family 1 cysteine activ.


BL00140D 22.64 8.274e-14 164-
208 BL00140C 11.80 5.444e-10 
77-102 


120 


BL00224 


Clathrin light chain proteins. 


BL00224B 16.94 6.712e-10 95-
148


122 


BL00203 


Vertebrate metallothioneins proteins. 


BL00203 13.94 1.000e-40 16-62


123 


PR00041 


CAMP RESPONSE ELEMENT
BINDING (CREB) PROTEIN
SIGNATURE


PR00041D 7.95 2.906e-09 24-41




124 


PR00041 


CAMP RESPONSE ELEMENT 
BINDING (CREB) PROTEIN 
SIGNATURE 


PR00041D 7.95 2.906e-09 24-41 


125 


BL00061 


Short-chain dehydrogenases/reductases 
family proteins. 


BL00061C 7.86 3.250e-10 212-
222 


126 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 6.400e-25 251-290 


127 


PR00318 


ALPHA G-PROTEIN (TRANSDUCIN) 
SIGNATURE 


PR00318D 16.28 1.900e-34 219-
248 PR00318B 14.79 3.455e-27
168-191 PR00318C 12.09 7.000e-
23 197-215 PR00318A 7.84
1.600e-19 35-51 PR00318E 7.23
2.500e-12 265-275


128 


PR00927 


ADENINE NUCLEOTIDE 
TRANSLOCATOR 1 SIGNATURE 


PR00927E 14.93 9.743e-10 67-89 
PR00927B 14.66 4.575e-09 69-91 


130 


BL00824 


Elongation factor 1 beta/beta'/delta chain
proteins. 


BL00824B 9.21 7.750e-22 133- 
153 


131 


BL00824 


Elongation factor 1 beta/beta'/delta chain
proteins.


BL00824C 14.58 1.000e-40 166-
204 BL00824D 14.04 1.621e-38
204-239 BL00824B 9.21 7.750e-
22 133-153 BL00824E 12.49
1.000e-19 247-263


132 


PR00209 


ALPHA/BETA GLIADIN FAMILY 
SIGNATURE 


PR00209B 4.88 9.222e-13 1209-
1228 


133 


PR00209 


ALPHA/BETA GLIADIN FAMILY 
SIGNATURE 


PR00209B 4.88 9.222e-13 1168-
1187 


134 


PR00708 


ALPHA-1-ACID GLYCOPROTEIN
SIGNATURE


PR00708D 14.67 1.000e-27 141-
168 PR00708C 11.77 1.643e-25 
98-120 PR00708B 15.15 2.174e- 
24 73-95 PR00708E 13.33 
1.600e-21 189-207 PR00708A 
14.40 2.636e-21 51-70 


135 


PR00109 


TYROSINE KINASE CATALYTIC 
DOMAIN SIGNATURE 


PR00109B 12.27 8.468e-13 126- 
145 


136 


PF00023 


Ank repeat proteins. 


PF00023A 16.03 3.250e-10 201- 
217 


137 


BL00471 


Small cytokines (intercrine/chemokine) 
C-x-C subfamily signat. 


BL00471 23.92 7.480e-10 42-90 


140 


PR00205 


CADHERIN SIGNATURE 


PR00205B 11.39 5.582e-10 328-
346 PR00205B 11.39 9.018e-10 
543-561 


141 


BL00412 


Neuromodulin (GAP-43) proteins. 


BL00412D 16.54 7.704e-09 976-
1027


143 


PR00979 


TAFAZZIN SIGNATURE 


PR00979E 10.83 5.950e-26 192- 
214 PR00979A 11.91 8.773e-25 
63-83 PR00979C 12.16 6.400e-19 
108-124 PR00979D 12.38 7.955e- 
19 170-185 PR00979F 10.14 
3.382e-15 230-244 PR00979B
15.59 5.636e-15 94-106 


145 


DM00686 


kw REPLICATION REP 28K 1 7.7K. 


DM00686C 14.14 7.720e-09 111- 
131 


146 


PR00604 


CLASS IA AND IB CYTOCHROME C
SIGNATURE


PR00604D 15.86 1.000e-17 87-
104 PR00604B 12.73 9.591e-16
57-73 PR00604C 10.21 8.200e-12
73-84 PR00604E 10.13 1.000e-11
106-117 PR00604A 11.13 8.800e-
11 44-52 PR00604F 8.60 1.000e-
10 123-132


147


BL00107


Protein kinases ATP-binding region
proteins.


BL00107A 18.39 3.864e-15 266-
297 BL00107B 13.31 6.143e-11
335-351


148 


PD00289 


PROTEIN SH3 DOMAIN REPEAT 
PRESYNA. 


PD00289 9.97 8.448e-09 67-81 


149 


PR00069 


ALDO-KETO REDUCTASE 
SIGNATURE 


PR00069D 19.36 1.857e-30 187-
217 PR00069A 16.01 7.429e-25
41-66 PR00069E 18.14 3.100e-22
235-260 PR00069C 16.03 7.000e- 
20 151-169 PR00069B 11.33 
8.071e-19 101-120 


150 


BL00027 


'Homeobox' domain proteins. 


BL00027 26.43 2.688e-27 139-182 


151 


PD02906 


SYNTHASE I PSEUDOURIDYLATE 
PSEUDOURIDINE LYASE TR. 


PD02906C 24.17 7.070e-22 165- 
200 PD02906B 15.35 8.393e-15 
114-127 PD02906A 10.84 6.500e-
09 71-84 


153 


BL00479 


Phorbol esters / diacylglycerol binding
domain proteins.


BL00479A 19.86 5.091e-12 891-
914 BL00479B 12.57 1.837e-11
915-931 


158 


BL00027 


'Homeobox' domain proteins. 


BL00027 26.43 6.786e-31 143-186 


160 


BL00422 


Granins proteins. 


BL00422C 16.18 7.750e-12 420- 
448 


162 


PR00625 


DNAJ PROTEIN FAMILY 
SIGNATURE 


PR00625A 12.84 9.297e-11 62-82


164 


BL01282 


BIR repeat proteins. 


BL01282B 30.49 6.182e-10 347- 
386 


166 


PR00860 


VERTEBRATE METALLOTHIONEIN 
SIGNATURE 


PR00860B 7.04 2.929e-20 83-97 
PR00860A 5.46 1.000e-18 61-74
PR00860C 9.61 1.900e-15 97-107


167 


PR00449 


TRANSFORMING PROTEIN P21 RAS 
SIGNATURE 


PR00449A 13.20 7.052e-09 196- 
218 


169 


BL00514 


Fibrinogen beta and gamma chains C- 
terminal domain proteins. 


BL00514C 17.41 1.346e-39 316- 
353 BL00514G 15.98 2.241e-34 
471-501 BL00514H 14.95 6.571e- 
27 510-535 BL00514E 14.28 
1.273e-16 388-405 BL00514D
15.35 9.100e-15 369-382
BL00514B 16.42 4.857e-14 260-
276 BL00514F 11.65 9.690e-14 
416-431 BL00514A 11.68 8.200e- 
11 149-159 


170 


BL00514 


Fibrinogen beta and gamma chains C- 
terminal domain proteins. 


BL00514C 17.41 1.346e-39 268-
305 BL00514G 15.98 2.241e-34
423-453 BL00514H 14.95 6.571e-
27 462-487 BL00514E 14.28
1.273e-16 340-357 BL00514D
15.35 9.100e-15 321-334
BL00514B 16.42 4.857e-14 212-
228 BL00514F 11.65 9.690e-14 
368-383 BL00514A 11.68 8.200e- 
11 101-111 


171 


BL00514 


Fibrinogen beta and gamma chains C-
terminal domain proteins.


BL00514G 15.98 2.241e-34 385-
415 BL00514H 14.95 6.571e-27
424-449 BL00514C 17.41 4.632e-
24 230-267 BL00514E 14.28
1.273e-16 302-319 BL00514D
15.35 9.100e-15 283-296
BL00514B 16.42 4.857e-14 212-
228 BL00514F 11.65 9.690e-14
330-345 BL00514A 11.68 8.200e-
11 101-111


173 


BL00027 


'Homeobox' domain proteins. 


BL00027 26.43 9.400e-29 119-162


174 


DM01970 


0 kw ZK632.12 YDR313C
ENDOSOMAL IE. 


DM01970B 8.60 5.119e-15 1391-
1404 


176 


BL00773 


Chitinases family 19 proteins. 


BL00773C 9.42 8.000e-09 2-16


182 


PR00109


TYROSINE KINASE CATALYTIC
DOMAIN SIGNATURE 


PR00109B 12.27 9.163e-14 141- 
160 


183


PD01937 


DNA PROTEIN POLYMERASE 
ENDONUCLEASE DNA-. 


PD01937A 6.68 3.475e-09 221- 
232 


185 


BL00845 


CAP-Gly domain proteins. 


BL00845 16.43 2.946e-23 247-272 
BL00845 16.43 1.628e-21 107-132 


186 


PR00452 


SH3 DOMAIN SIGNATURE 


PR00452B 11.65 6.538e-11 525-
541 


187 


PR00452 


SH3 DOMAIN SIGNATURE 


PR00452B 11.65 6.538e-11 497-
513 


188 


DM01803 


1 HERPESVIRUS GLYCOPROTEIN H. 


DM01803A 10.51 1.000e-09
1081-1102 


189 


PF00651 


BTB (also known as BR-C/Ttk) domain 
proteins. 


PF00651 15.00 5.091e-15 69-82


190 


PR00194 


TROPOMYOSIN SIGNATURE 


PR00194C 6.38 1.900e-35 145- 
174 PR00194E 8.74 3.250e-30
231-257 PR00194D 9.57 1.500e-
26 175-199 PR00194B 10.24
5.200e-24 120-141 PR00194A
7.86 4.857e-21 84-102 


192 


PD02042 


IRON-SULFUR ELECTRON 
TRANSPORT AROMATIC 
HYDROCARB. 


PD02042B 16.75 5.154e-09 131- 
146 PD02042A 21.13 5.909e-09 
94-121 


193 


PR00021 


SMALL PROLINE-RICH PROTEIN 
SIGNATURE 


PR00021A 4.31 2.200e-10 2-15


195 


BL00463 


Fungal Zn(2)-Cys(6) binuclear cluster
domain proteins. 


BL00463 8.22 5.071e-09 111-123 


196 


PR00118 


BETA-LACTAMASE CLASS A 
SIGNATURE 


PR00118F 16.42 9.386e-09 165-
181 


197 


DM00215 


PROLINE-RICH PROTEIN 3. 


DM00215 19.43 5.424e-09 234-
267


198 


BL00660 


Band 4.1 family domain proteins. 


BL00660A 31.50 5.500e-11 714-
767


199 


BL00282 


Kazal serine protease inhibitors family 
proteins. 


BL00282 16.88 8.820e-13 70-93 


202 


PR00009 


TYPE I EGF SIGNATURE 


PR00009A 14.15 5.345e-15 971- 
987 PR00009C 14.11 8.773e-13 
996-1008 PR00009D 16.83 
8.000e-11 1008-1018 PR00009C
14.11 1.882e-09 892-904


203 


BL00025 


P-type 'Trefoil' domain proteins.


BL00025 17.17 4.536e-19 38-59 


205 


BL00018 


EF-hand calcium-binding domain 
proteins. 


BL00018 7.41 7.300e-10 165-178 


206 


PR00168 


SLOW VOLTAGE-GATED 
POTASSIUM CHANNEL SIGNATURE 


PR00168D 12.88 6.865e-11 67-86


207 


BL00025 


P-type 'Trefoil' domain proteins.


BL00025 17.17 3.423e-20 39-60 
BL00025 17.17 8.750e-16 88-109 


209 


BL00646 


Ribosomal protein S13 proteins. 


BL00646B 21.42 6.100e-30 110- 
143 BL00646A 25.82 6.192e-29 
14-62 


210 


PR00138 


MATRIXIN SIGNATURE


PR00138D 16.56 3.605e-25 279-
305 PR00138C 16.41 3.000e-24
218-247 PR00138E 6.01 8.714e-
13 314-328 PR00138A 15.14
9.538e-13 134-148 PR00138B
15.82 4.522e-12 188-204


211 


DM01206 


CORONAVIRUS NUCLEOCAPSID
PROTEIN. 


DM01206B 10.69 8.429e-12 386- 
406 DM01206B 10.69 1.247e-10 
384-404 DM01206B 10.69 
5.068e-10 388-408 


212 


PD01941 


TRANSMEMBRANE 
COTRANSPORTER SYMP. 


PD01941A 14.81 1.000e-40 163-
217 PD01941B 15.02 9.705e-30 
420-467 PD01941E 15.92 8.714e- 
23 837-884 PD01941C 19.96 
8.200e-20 508-563 PD01941D 
27.18 1.600e-16 661-710 
PD01941F 28.52 9.645e-15 1005- 
1060 


213 


BL00362 


Ribosomal protein S15 proteins.


BL00362 24.67 8.313e-09 330-373


214 


BL00115 


Eukaryotic RNA polymerase II 
heptapeptide repeat proteins. 


BL00115Z 3.12 2.125e-09 1178-
1227 BL00115Z 3.12 6.096e-09
1164-1213 


215 


BL00038 


Myc-type, 'helix-loop-helix' dimerization 
domain proteins. 


BL00038B 16.97 7.600e-18 125- 
146 BL00038A 13.61 1.474e-13 
102-118 


216 


BL01108 


Ribosomal protein L24 proteins. 


BL01108A 20.33 2.241e-22 49-82
BL01108B 11.40 8.457e-10 96-
107 


217 


PR00381 


KINESIN LIGHT CHAIN SIGNATURE 


PR00381A 9.55 1.321e-10 360-
378 


222 


BL00514


Fibrinogen beta and gamma chains C-
terminal domain proteins.


BL00514C 17.41 2.358e-26 1166-
1203 BL00514G 15.98 9.000e-15
1289-1319 BL00514D 15.35
6.936e-12 1207-1220 BL00514F
11.65 4.288e-10 1253-1268
BL00514H 14.95 8.636e-10 1318- 
1343 


223 


BL00325 


Actin-depolymerizing proteins. 


BL00325B 21.66 1.000e-40 93-
139 BL00325A 24.83 9.333e-24 
61-93 


224 


BL00018 


EF-hand calcium-binding domain
proteins.


BL00018 7.41 1.450e-10 231-244


225 


PF01329 


Pterin 4 alpha carbinolamine dehydratase.


PF01329B 18.52 1.692e-18 67-92 


228 


BL00211 


ABC transporters family proteins. 


BL00211B 13.37 6.250e-18 1033- 
1065 BL00211B 13.37 8.875e-18
2045-2077 BL00211A 12.23
1.900e-09 931-943


230 


PR00761 


BINDIN PRECURSOR SIGNATURE 


PR00761A 5.81 9.366e-09 275- 
292 


231 


PR00049 


WILM'S TUMOUR PROTEIN 
SIGNATURE 


PR00049D 0.00 3.500e-10 54-69 


232 


BL00412 


Neuromodulin (GAP-43) proteins. 


BL00412D 16.54 1.978e-10 109- 
160 BL00412D 16.54 4.122e-09 
133-184 


233 


BL01210 


Caveolins proteins. 


BL01210B 13.92 8.129e-09 106- 
156 


236 


BL00939 


Ribosomal protein L1e proteins.


BL00939F 17.27 5.393e-09 861- 
891 


238 


BL01252 


Endogenous opioids neuropeptides 
precursors proteins. 


BL01252D 18.25 3.571e-28 205-
233 BL01252B 19.09 5.034e-27
37-67 BL01252C 18.10 1.621e-21
164-190 BL01252A 14.22 7.107e-
18 14-34


239 


BL00302 


Eukaryotic initiation factor 5A hypusine
proteins.


BL00302 14.81 1.000e-40 25-79


240 


PR00420 


AROMATIC-RING HYDROXYLASE 
(FLAVOPROTEIN 
MONOOXYGENASE) SIGNATURE 


PR00420A 14.78 8.851e-13 26-49 


241 


PD02929 


ADHESION GLYCOPROTEIN 
PRECURSOR L 


PD02929A 28.27 4.529e-09 235-
289 


243


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 8.527e-25 11-50 


244 


BL01270 


Band 7 protein family proteins. 


BL01270C 16.91 6.745e-17 115- 
144 BL01270B 18.74 6.857e-17 
76-115 BL01270E 13.03 6.016e- 
15 182-211 BL01270D 20.87 
9.160e-13 144-182 


245 


PF00791 


Domain present in ZO-1 and Unc5-like 
netrin receptors. 


PF00791B 28.49 6.305e-12 253- 
308 PF00791B 28.49 1.909e-11
427-482 PF00791B 28.49 2.651e-
09 179-234 PF00791B 28.49
3.890e-09 112-167


246 


PD00066 


PROTEIN ZINC-FINGER METAL- 
BINDI. 


PD00066 13.92 2.500e-13 277-290 
PD00066 13.92 9.143e-12 193-206 
PD00066 13.92 5.304e-11 165-178
PD00066 13.92 6.478e-11 249-262
PD00066 13.92 3.423e-10 221-234


247 


BL00406 


Actins proteins. 


BL00406D 12.58 6.400e-20 465-
520 BL00406B 5.47 4.857e-14
249-304 BL00406E 8.44 1.000e-
11 522-572 BL00406C 6.75
5.449e-11 313-368


248 


BL00951 


ER lumen protein retaining receptor 
proteins. 


BL00951C 19.35 1.000e-40 112-
161 BL00951A 15.10 7.750e-39
21-57 BL00951D 13.94 6.000e-38
161-196 BL00951B 14.23 3.100e-
31 57-88


252 


BL01113 


C1q domain proteins.


BL01113A 17.99 9.129e-15 200-
227 BL01113A 17.99 4.818e-14
194-221 BL01113A 17.99 7.818e-
14 182-209 BL01113A 17.99
1.730e-13 185-212 BL01113A
17.99 6.595e-13 191-218
BL01113A 17.99 6.077e-12 203-
230 BL01113A 17.99 9.182e-11
179-206 BL01113A 17.99 2.532e-
10 176-203 BL01113A 17.99
9.043e-10 218-245 BL01113A
17.99 9.426e-10 209-236
BL01113A 17.99 4.115e-09 137-
164


257 


BL00845 


CAP-Gly domain proteins. 


BL00845 16.43 1.837e-21 466-491 


259 


PR00248 


METABOTROPIC GLUTAMATE 
GPCR SIGNATURE 


PR00248G 12.67 2.688e-09 53-78 


260 


BL00678 


Trp-Asp (WD) repeat proteins proteins. 


BL00678 9.67 3.400e-10 441-452 
BL00678 9.67 5.800e-10 481-492
BL00678 9.67 8.800e-10 358-369 


261 


BL00678 


Trp-Asp (WD) repeat proteins proteins. 


BL00678 9.67 3.400e-10 415-426 
BL00678 9.67 5.800e-10 455-466 






WO 01/57190



PCT/US01/04098



BL00678 9.67 8.800e-10 332-343 


262 


BL00678 


Trp-Asp (WD) repeat proteins proteins. 


BL00678 9.67 3.400e-10 468-479 
BL00678 9.67 5.800e-10 508-519
BL00678 9.67 8.800e-10 385-396 


263 


BL50002 


Src homology 3 (SH3) domain proteins
profile.


BL50002B 15.18 2.200e-10 415-
429 


264 


BL00049 


Ribosomal protein L14 proteins. 


BL00049C 17.38 3.040e-12 94- 
130 


265 


PD01469 


GLYCOPROTEIN PROTEIN 
PRECURSOR SA. 


PD01469 20.59 2.091e-14 438-470 


266 


PD01469 


GLYCOPROTEIN PROTEIN 
PRECURSOR SA. 


PD01469 20.59 2.091e-14 279-311


267 


BL00567 


Phosphoribulokinase proteins. 


BL00567A 10.66 1.161e-12 36-55 


269 


BL00049 


Ribosomal protein L14 proteins. 


BL00049C 17.38 2.688e-28 92- 
128 BL00049B 18.42 6.806e-24
54-86 BL00049A 13.86 8.333e-19 
19-42 BL00049D 13.47 5.765e-12 
129-140 


272 


BL01115 


GTP-binding nuclear protein ran proteins. 


BL01115A 10.22 9.735e-12 14-58


273 


PR00021 


SMALL PROLINE-RICH PROTEIN 
SIGNATURE 


PR00021A 4.31 1.911e-09 819-
832


275 


PR00179 


LIPOCALIN SIGNATURE 


PR00179B 9.56 2.895e-13 124-
137 PR00179A 13.78 3.250e-11
36-49 PR00179C 19.02 6.040e-11
154-170 


276 


PR00449 


TRANSFORMING PROTEIN P21 RAS 
SIGNATURE 


PR00449A 13.20 8.364e-17 22-44 
PR00449C 17.27 1.000e-13 62-85
PR00449E 13.50 4.000e-12 172- 
195 PR00449B 14.34 5.680e-10 
45-62 


277 


BL00140 


Ubiquitin carboxyl-terminal hydrolase 
family 1 cysteine activ. 


BL00140D 22.64 1.000e-40 161-
205 BL00140C 11.80 9.053e-30
79-104 BL00140A 15.96 9.400e-
28 5-35 BL00140B 12.29 4.649e-
17 37-55 


278 


PD02712 


ELEMENT TRANSPOSASE FOR 
TRANSPOSON TRANSPOSABLE. 


PD02712A 23.03 8.013e-09 47-83 


279 


BL00678 


Trp-Asp (WD) repeat proteins proteins. 


BL00678 9.67 1.474e-09 100-111 


282 


DM00892 


3 RETROVIRAL PROTEINASE. 


DM00892C 23.55 4.767e-21 864- 
898 


283 


BL00048 


Protamine P1 proteins.


BL00048 6.39 9.550e-09 56-83 


286 


PR00081 


GLUCOSE/RIBITOL 
DEHYDROGENASE FAMILY 
SIGNATURE 


PR00081A 10.53 1.878e-11 36-54


287 


PR00310 


ANTI-PROLIFERATIVE PROTEIN 
BTG1 FAMILY SIGNATURE


PR00310B 10.59 4.231e-17 29-59
PR00310D 9.10 6.679e-16 89-119


289 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 7.000e-36 37-76 


293 


BL00979 


G-protein coupled receptors family 3 
proteins. 


BL00979L 20.63 3.800e-12 111-
152 


295 


PD02411 


PROTEIN TRANSCRIPTION 
REGULATION NUCLEAR. 


PD02411 21.89 7.000e-16 195-229 


296 


BL01064 


Pyridoxamine 5'-phosphate oxidase
proteins. 


BL01064A 27.84 8.313e-28 77- 
129 BL01064C 15.22 7.136e-25 
202-235 


297 


BL00030 


Eukaryotic RNA-binding region RNP-1 
proteins. 


BL00030A 14.39 2.929e-13 37-56 
BL00030B 7.03 1.900e-11 167-
177 BL00030A 14.39 2.000e-10
128-147


298


BL01183 


ubiE/COQ5 methyltransferase family
proteins.


BL01183B 21.31 6.660e-12 143-
188 


299 


BL01279 


Protein-L-isoaspartate(D-aspartate) O-
methyltransferase signa.


BL01279A 24.27 5.862e-11 57-
105 


301 


BL00191 


Cytochrome b5 family, heme-binding 
domain proteins. 


BL00191K 17.38 4.951e-27 184- 
228 BL00191J 11.37 6.447e-17
128-150 


302 


DM00892 


3 RETROVIRAL PROTEINASE. 


DM00892C 23.55 3.893e-16 33-67 


306 


PF01140 


Matrix protein (MA), p15.


PF01140D 15.54 2.988e-09 416- 
451 


307


PR00245 


OLFACTORY RECEPTOR 
SIGNATURE 


PR00245A 18.03 4.818e-21 59-81 
PR00245C 7.84 5.154e-20 238-
254 PR00245D 10.47 4.000e-15 
274-286 PR00245B 10.38 8.200e- 
15 177-192 PR00245E 12.40 
5.714e-12 291-306 


309 


BL00203 


Vertebrate metallothioneins proteins. 


BL00203 13.94 2.245e-10 612-658 


310 


BL00237 


G-protein coupled receptors proteins. 


BL00237A 27.68 7.632e-23 119- 
159 BL00237C 13.19 3.864e-15 
251-278 BL00237D 11.23 3.739e-
12 312-329


311 


BL00380 


Rhodanese proteins. 


BL00380D 15.90 8.200e-28 110- 
136 BL00380G 11.26 5.800e-16
267-280 BL00380B 14.77 7.000e-
14 49-62 BL00380F 9.76 5.886e-
13 203-214 BL00380C 15.67
7.387e-13 82-98 BL00380E 12.44
7.000e-11 181-193 BL00380A
10.48 1.000e-09 10-20


312 


BL00227 


Tubulin subunits alpha, beta, and gamma 
proteins. 


BL00227B 19.29 1.000e-40 50-
105 BL00227C 25.48 1.000e-40
111-163 BL00227D 18.46 1.000e-
40 220-274 BL00227F 21.16
1.000e-40 372-426 BL00227A
24.55 3.250e-39 1-35 BL00227E 
24.15 8.500e-34 324-359 


327 


BL00232 


Cadherins extracellular repeat proteins 
domain proteins. 


BL00232B 32.79 7.362e-21 225- 
273 BL00232B 32.79 2.588e-17 
435-483 BL00232B 32.79 6.301e- 
15 116-164 BL00232B 32.79 
6.769e-13 330-378 BL00232C 
10.65 9.341e-12 223-241 
BL00232C 10.65 5.696e-ll 328- 
346 BL00232C 10.65 3.942e-10 
433-451 


329


PD02749 


TRANSCRIPTION PROTEIN FACTOR 
BTF3 REGULATION NUCL. 


PD02749B 12.75 2.241e-37 35-71 
PD02749C 13.96 4.892e-28 87- 
121 PD02749A 9.56 6.000e-15 2-
15 


330 


PR00391 


PHOSPHATIDYLINOSITOL 
TRANSFER PROTEIN SIGNATURE 


PR00391E 12.50 7.785e-15 211- 
231 PR00391B 8.39 1.000e-13
83-104 PR00391D 12.21 9.328e-
13 191-207 PR00391A 7.83
5.390e-11 16-36


332 


BL01030 


RNA polymerases M / 15 Kd subunits 
proteins. 


BL01030 23.44 1.818e-23 87-125 


337 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 2.929e-32 6-45 


340 


PD02711 


SYNTHASE
PHOSPHORIBOSYLFORMYLGLY.


PD02711B 14.26 1.973e-20 944-
968


343 


BL00223 


Annexins repeat proteins domain
proteins.


BL00223C 24.79 1.000e-40 245-
300 BL00223B 28.47 8.714e-38
168-218 BL00223A 15.59 8.250e-
27 98-132 BL00223A 15.59
8.750e-27 26-60 BL00223C 24.79
9.438e-16 13-68 BL00223C 24.79
2.735e-15 85-140 BL00223A
15.59 2.253e-11 258-292


346 


PR00345 


STATHMIN FAMILY SIGNATURE 


PR00345B 7.12 2.800e-28 81-110
PR00345E 8.54 7.652e-28 158-
183 PR00345C 4.54 9.100e-28
110-134 PR00345D 10.97 1.964e-
24 134-158 PR00345A 13.46
5.645e-16 52-71


347 


BL00586 


Ribosomal protein L16 proteins.


BL00586B 17.00 3.215e-15 184-
221 


348 


PR00388 


3'5'-CYCLIC NUCLEOTIDE CLASS II
PHOSPHODIESTERASE SIGNATURE 


PR00388A 10.45 2.778e-09 86- 
105 


351 


BL00018 


EF-hand calcium-binding domain 
proteins. 


BL00018 7.41 3.118e-ll 160-173 
BL00018 7.41 2.350e-10 244-257


354 


BL00678 


Trp-Asp (WD) repeat proteins proteins. 


BL00678 9.67 1.947e-09 256-267 


358 


DM01206 


CORONAVIRUS NUCLEOCAPSID
PROTEIN. 


DM01206B 10.69 3.278e-09 175- 
195 DM01206B 10.69 6.696e-09 
183-203 DM01206B 10.69 
8.633e-09 132-152 DM01206B 
10.69 8.861e-09 181-201 
DM01206B 10.69 9.316e-09 177- 
197 


361 


PD01498 


OXIDASE BIOSYNTHESIS 
OXIDOREDUCTASE PORP. 


PD01498C 24.90 6.880e-14 219-
263 


362 


PD01498 


OXIDASE BIOSYNTHESIS 
OXIDOREDUCTASE PORP. 


PD01498C 24.90 6.880e-14 219- 
263 


365 


BL00178 


Aminoacyl-transfer RNA synthetases 
class-I proteins. 


BL00178B 7.11 1.000e-11 589-
600 BL00178A 14.23 8.500e-09 
46-56 


366 


BL00523 


Sulfatases proteins. 


BL00523E 19.27 1.000e-23 318-
348 BL00523A 13.36 5.500e-16
30-47 BL00523B 8.64 1.964e-13
78-90 BL00523C 12.64 9.625e-13
129-140 BL00523G 9.46 5.500e-
10 506-516 


369 


BL00107 


Protein kinases ATP-binding region 
proteins. 


BL00107A 18.39 4.818e-09 21-52


370 


BL00880 


Acyl-CoA-binding protein. 


BL00880 17.52 1.000e-40 75-125


371 


BL00107 


Protein kinases ATP-binding region 
proteins. 


BL00107A 18.39 1.000e-23 276-
307 BL00107B 13.31 1.692e-12 
342-358 


372 


PR00211 


GLUTELIN SIGNATURE 


PR00211B 0.86 6.602e-11 326-
347 PR00211B 0.86 6.106e-10
320-341 PR00211B 0.86 3.167e-
09 333-354 


373 


BL00279 


Membrane attack complex components / 
perforin proteins. 


BL00279E 37.11 9.349e-10 749-
797 


375 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 1.231e-33 10-49


377 


PD01066


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 7.563e-28 10-49 


379 


BL00598 


Chromo domain proteins. 


BL00598 14.45 5.781e-16 3-25 
380


PR00413


HALOACID
DEHALOGENASE/EPOXIDE
HYDROLASE FAMILY SIGNATURE


PR00413D 11.28 8.941e-09 864-
878


383 


PR00413 


HALOACID
DEHALOGENASE/EPOXIDE
HYDROLASE FAMILY SIGNATURE 


PR00413D 11.28 8.941e-09 864- 
878 


387 


BL01060 


Flagella transport protein fliP family 
proteins. 


BL01060A 15.65 1.535e-09 131- 
174 


388 


PR00209 


ALPHA/BETA GLIADIN FAMILY
SIGNATURE


PR00209B 4.88 6.318e-11 1009-
1028 


389 


PR00837 


ALLERGEN V5/TPX-1 FAMILY 
SIGNATURE 


PR00837B 11.64 1.000e-10 469-
483 


391 


BL00240 


Receptor tyrosine kinase class III 
proteins. 


BL00240B 24.70 7.907e-10 118- 
142 


392 


PR00014 


FIBRONECTIN TYPE III REPEAT 
SIGNATURE 


PR00014D 12.04 8.412e-10 691-
706


393 


PR00014 


FIBRONECTIN TYPE III REPEAT 
SIGNATURE 


PR00014D 12.04 8.412e-10 706- 
721 


394 


BL01209 


LDL-receptor class A (LDLRA) domain 
proteins. 


BL01209 9.31 3.368e-15 47-60 
BL01209 9.31 5.500e-13 92-105 


395 


BL00634 


Ribosomal protein L30 proteins. 


BL00634 34.38 4.090e-13 70-121 


396 


BL01013 


Oxysterol-binding protein family 
proteins. 


BL01013D 26.81 8.000e-26 358- 
402 BL01013A 25.14 7.231e-21 
45-81 BL01013C 9.97 1.000e-13
132-142 BL01013B 11.33 1.000e-
11 110-121 


397 


BL00930 


Peripherin / rom-1 proteins.


BL00930E 17.80 1.000e-40 56-92
BL00930D 9.12 4.632e-37 12-56
BL00930F 16.91 2.800e-36 92-
133


400 


PR00780 


LEUSERPIN 2 SIGNATURE 


PR00780B 4.89 4.491e-09 262-
285


401 


PR00819 


CBXX/CFQX SUPERFAMILY 
SIGNATURE 


PR00819B 10.83 7.158e-ll 4-20 


403 


BL00381


Endopeptidase Clp serine proteins. 


BL00381C 23.84 1.250e-32 150- 
194 BL00381A 16.48 2.286e-22 
74-111 BL00381B 21.42 8.326e-
14 78-130 


405 


BL01105 


Ribosomal protein L35Ae proteins. 


BL01105A 17.37 1.000e-40 4-49
BL01105B 12.95 1.000e-40 68-
108 


406 


BL00344 


GATA-type zinc finger domain proteins.


BL00344 17.99 7.000e-12 814-852 


407 


PR00211 


GLUTELIN SIGNATURE 


PR00211B 0.86 9.750e-09 73-94


409 


PR00910 


LUTEOVIRUS ORF6 PROTEIN
SIGNATURE


PR00910A 2.51 4.321e-09 9-22


410 


BL00762 


WHEP-TRS domain proteins. 


BL00762A 23.43 1.000e-28 752-
789 BL00762A 23.43 4.400e-21
903-940 BL00762A 23.43 5.415e- 
18 825-862 BL00762B 16.14 
8.759e-12 1154-1168 


412 


BL00690 


DEAH-box subfamily ATP-dependent 
helicases proteins. 


BL00690B 13.38 5.320e-15 262-280
BL00690A 6.87 1.818e-13 230-240


415 


BL00227 


Tubulin subunits alpha, beta, and gamma 
proteins. 


BL00227B 19.29 1.000e-40 52-107
BL00227C 25.48 1.000e-40 113-165
BL00227D 18.46 1.000e-40 222-276
BL00227F 21.16 1.000e-40 382-436
BL00227E 24.15 1.750e-34 326-361
BL00227A 24.55 1.000e-33 1-35


416 


PF00992 


Troponin. 


PF00992A 16.67 1.711e-09 557- 
592 


418 


BL00541 


Nuclear transition protein 1 proteins. 


BL00541 8.44 9.875e-09 256-310


419 


BL00541 


Nuclear transition protein 1 proteins. 


BL00541 8.44 9.875e-09 197-251 


420 


PF00856 


SET domain proteins. 


PF00856A 26.14 9.074e-13 901-938
PF00856B 16.42 2.397e-12 951-973


421 


BL00678 


Trp-Asp (WD) repeat proteins proteins. 


BL00678 9.67 8.200e-12 33-44 


423 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 8.600e-30 130-169 


424 


PF00564 


Octicosapeptide repeat proteins. 


PF00564B 24.74 1.305e-17 421- 
472 


426 


PR00988 


URIDINE KINASE SIGNATURE 


PR00988A 6.39 4.569e-12 3-21 


427 


PR00988 


URIDINE KINASE SIGNATURE 


PR00988A 6.39 4.569e-12 3-21 


428 


BL00478 


LIM domain proteins. 


BL00478B 14.79 3.250e-13 115-130
BL00478B 14.79 9.036e-13 50-65


431 


BL00282 


Kazal serine protease inhibitors family 
proteins. 


BL00282 16.88 8.875e-12 464-487 


432 


PD00930 


PROTEIN GTPASE DOMAIN 
ACTIVATION. 


PD00930B 33.72 7.800e-18 316- 
357 PD00930A 25.62 9.617e-12 
125-151 PD00930B 33.72 2.521e- 
10 214-255 


433 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 4.649e-34 34-73 


434 


PR00449 


TRANSFORMING PROTEIN P21 RAS 
SIGNATURE 


PR00449A 13.20 7.563e-11 56-78


436 


PR00120 


H+-TRANSPORTING ATPASE 
(PROTON PUMP) SIGNATURE 


PR00120C 9.90 5.800e-19 705- 
722 


437 


BL00115 


Eukaryotic RNA polymerase II 
heptapeptide repeat proteins. 


BL00115T 8.45 7.273e-29 1208-1242
BL00115Q 18.08 2.776e-21 953-983
BL00115Y 11.86 8.000e-17 1604-1650
BL00115M 19.19 8.130e-16 731-774
BL00115H 14.34 9.392e-16 463-496
BL00115A 15.44 7.414e-15 43-82
BL00115R 6.50 6.128e-14 983-1010
BL00115J 16.71 9.289e-14 591-617
BL00115I 8.33 4.336e-13 535-590
BL00115L 12.25 5.939e-13 662-694
BL00115G 11.65 6.011e-13 435-463
BL00115K 15.03 3.417e-10 617-659
BL00115O 16.76 5.805e-10 863-913
BL00115P 11.54 7.538e-10 913-953
BL00115S 18.24 7.968e-10 1010-1052
BL00115U 10.34 4.475e-09 1242-1265


438 


PF00628 


PHD-finger. 


PF00628 15.84 4.536e-10 219-234 


440 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 6.351e-34 10-49 


441 


PR00309 


ARRESTIN SIGNATURE 


PR00309A 9.68 5.250e-24 32-55
PR00309D 7.09 4.938e-23 290-309
PR00309B 7.81 2.800e-21 69-88
PR00309C 8.22 1.621e-19 165-183
PR00309E 9.82 9.438e-15 374-389


442 


BL00600 


Aminotransferases class-III pyridoxal-
phosphate attachment si.


BL00600B 19.60 7.324e-14 103-129
BL00600G 12.43 2.125e-12 306-325
BL00600F 8.77 8.105e-12 271-284
BL00600E 16.43 3.167e-11 228-257
BL00600D 8.71 8.650e-09 207-221


443 


BL00972 


Ubiquitin carboxyl-terminal hydrolases 
family 2 proteins. 


BL00972A 11.93 3.160e-18 69-87 


444 


BL00349 


CTF/NF-I proteins. 


BL00349A 10.07 1.000e-40 8-54
BL00349C 9.33 1.000e-40 82-125
BL00349E 10.79 1.000e-40 152-195
BL00349F 11.81 1.000e-40 213-255
BL00349H 15.70 7.387e-36 361-399
BL00349B 10.51 2.227e-34 54-82
BL00349D 11.70 9.100e-34 125-152
BL00349G 19.72 5.781e-30 323-356


445 


BL00154 


E1-E2 ATPases phosphorylation site 
proteins. 


BL00154F 8.23 8.941e-21 271- 
295 BL00154E 20.37 2.620e-15 
124-165 


448 


DM00215


PROLINE-RICH PROTEIN 3. 


DM00215 19.43 4.882e-11 82-115
DM00215 19.43 6.492e-09 87-120


451 


BL01283 


T-box domain proteins. 


BL01283A 24.15 3.100e-40 112-160
BL01283D 11.70 6.000e-39 253-286
BL01283B 23.17 6.538e-38 170-212
BL01283C 13.05 7.750e-19 222-236


452 


PR00420 


AROMATIC-RING HYDROXYLASE 
(FLAVOPROTEIN 
MONOOXYGENASE) SIGNATURE 


PR00420A 14.78 2.579e-11 3-26


453


PR00162 


RIESKE 2FE-2S SUBUNIT 
SIGNATURE 


PR00162B 12.77 7.429e-17 215-228
PR00162A 9.35 2.324e-14 193-205
PR00162C 8.10 7.120e-14 227-240


454 


PD01066


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 7.000e-30 87-126 


456 


BL00027 


'Homeobox' domain proteins.


BL00027 26.43 9.333e-18 1149- 
1192 


457 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 2.737e-24 16-55 


459 


BL00290 


Immunoglobulins and major 
histocompatibility complex proteins. 


BL00290A 20.89 1.529e-14 154-177
BL00290B 13.17 9.000e-12 214-232


460 


PR00413 


HALOACID
DEHALOGENASE/EPOXIDE
HYDROLASE FAMILY SIGNATURE


PR00413F 14.91 7.333e-11 193-214
PR00413E 15.78 5.714e-09 175-192


463 


PR00759 


BASIC PROTEASE (KUNITZ-TYPE) 
INHIBITOR FAMILY SIGNATURE 


PR00759B 11.26 8.385e-09 74-85


466 


BL00019 


Actinin-type actin-binding domain 
proteins. 


BL00019D 15.33 4.200e-19 300- 
330 


467 


BL00019 


Actinin-type actin-binding domain 
proteins. 


BL00019D 15.33 4.200e-19 300-330


469 


PR00153 


CYCLOPHILIN PEPTIDYL-PROLYL
CIS-TRANS ISOMERASE 
SIGNATURE 


PR00153D 11.99 3.250e-15 510-523
PR00153C 11.01 4.682e-14 495-511
PR00153E 9.10 8.548e-14 523-539
PR00153B 11.57 1.720e-13 452-465


470 


BL00491 


Aminopeptidase P and proline 
dipeptidase proteins. 


BL00491C 12.15 3.912e-09 557-572


471 


PD00289 


PROTEIN SH3 DOMAIN REPEAT
PRESYNA.


PD00289 9.97 1.000e-14 1482-1496
PD00289 9.97 8.650e-11 1122-1136


474


BL50040 


Elongation factor 1 gamma chain profile. 


BL50040D 17.41 1.000e-40 279-329
BL50040E 18.79 1.000e-40 333-388
BL50040F 18.99 5.320e-40 390-428
BL50040C 22.62 3.739e-38 141-184
BL50040B 13.65 7.000e-30 59-85
BL50040A 12.98 1.450e-14 10-22


475 


BL01144 


Ribosomal protein L31e proteins.


BL01144 25.07 1.000e-40 22-74


476 


PR00007 


COMPLEMENT C1Q DOMAIN
SIGNATURE 


PR00007C 15.60 2.421e-21 589-611
PR00007B 14.16 3.500e-21 544-564
PR00007A 19.33 6.897e-20 517-544
PR00007D 9.64 6.571e-12 623-634


477 


BL50002 


Src homology 3 (SH3) domain proteins 
profile. 


BL50002A 14.19 5.846e-10 170- 
189 


479 


DM01970 


0 kw ZK632.12 YDR313C
ENDOSOMAL III. 


DM01970B 8.60 9.500e-17 967- 
980 


480 


PR00868 


DNA-POLYMERASE FAMILY A (POL
I) SIGNATURE


PR00868C 13.76 5.688e-17 284-308
PR00868A 16.33 3.186e-13 224-247
PR00868H 12.51 3.388e-13 431-448
PR00868I 10.87 7.938e-11 462-476
PR00868E 13.19 1.608e-10 340-366


481 


BL00027 


'Homeobox' domain proteins. 


BL00027 26.43 9.182e-22 53-96 


482 


BL00061 


Short-chain dehydrogenases/reductases 
family proteins. 


BL00061B 25.79 3.647e-21 188- 
226 


483 


BL50002 


Src homology 3 (SH3) domain proteins 
profile. 


BL50002A 14.19 1.750e-12 1032- 
1051 


485


PF00023 


Ank repeat proteins. 


PF00023A 16.03 9.625e-10 760- 
776 PF00023A 16.03 3.571e-09 
715-731 


486 


PD02870 


RECEPTOR INTERLEUKIN-1 
PRECURSOR. 


PD02870B 18.83 9.262e-20 103- 
136 PD02870D 15.74 9.426e-09 
201-236 


487 


PR00370 


FLAVIN-CONTAINING 
MONOOXYGENASE (FMO) 
SIGNATURE 


PR00370G 10.45 3.769e-28 471-493
PR00370B 10.91 1.000e-24 27-46
PR00370C 12.72 4.000e-21 140-157
PR00370E 11.96 9.229e-21 320-339
PR00370D 16.33 1.750e-20 185-204
PR00370F 17.75 7.395e-20 375-395
PR00370A 3.35 2.038e-18 4-20


489 


PD01675 


GLYCOPROTEIN MAJOR ENVELOPE 
PROBABLE U3. 


PD01675C 19.89 2.330e-10 55-89


492 


BL00211 


ABC transporters family proteins. 


BL00211A 12.23 5.050e-09 45-57 


493 


BL00211 


ABC transporters family proteins. 


BL00211A 12.23 5.050e-09 45-57


494 


BL00211 


ABC transporters family proteins. 


BL00211A 12.23 5.050e-09 58-70 


495 


BL00027 


'Homeobox' domain proteins. 


BL00027 26.43 6.786e-12 509-552
BL00027 26.43 9.143e-12 319-362
BL00027 26.43 2.600e-11 627-670
BL00027 26.43 3.625e-10 779-822


497 


BL00107 


Protein kinases ATP-binding region 
proteins. 


BL00107A 18.39 5.800e-22 214-245
BL00107B 13.31 1.000e-13 281-297
BL00107A 18.39 3.520e-13 583-614
BL00107B 13.31 8.615e-12 652-668


499 


BL00383 


Tyrosine specific protein phosphatases
proteins.


BL00383E 10.35 1.000e-14 1902-1913
BL00383D 11.92 3.077e-14 1862-1875
BL00383A 13.34 5.500e-14 1730-1745
BL00383C 10.10 2.000e-13 1785-1796
BL00383F 15.51 9.069e-12 1940-1956
BL00383B 7.61 1.692e-11 1755-1764


501 


PR00019 


LEUCINE-RICH REPEAT 
SIGNATURE 


PR00019B 11.36 1.360e-09 136- 
150 PR00019A 11.19 1.667e-09 
91-105 PR00019B 11.36 4.600e- 
09 160-174 


503 


BL00226 


Intermediate filaments proteins. 


BL00226D 19.10 1.000e-40 367-414
BL00226B 23.86 6.143e-27 195-243
BL00226A 12.77 7.840e-14 96-111
BL00226C 13.23 2.600e-13 309-340
BL00226C 13.23 6.143e-12 266-297
BL00226B 23.86 1.209e-09 146-194


505 


PD02407 


3-BISPHOSPHOGLYCERATE- 
INDEPENDENT PHOSPHOGLYCER. 


PD02407F 7.61 6.739e-09 916-930


506 


PF00632 


HECT-domain (ubiquitin-transferase). 


PF00632C 20.66 9.830e-19 991-1023
PF00632B 18.45 1.155e-11 940-968


507 


BL01082 


Ribosomal protein L7Ae proteins. 


BL01082 20.37 4.273e-20 76-116


508 


BL00678 


Trp-Asp (WD) repeat proteins proteins. 


BL00678 9.67 2.421e-09 493-504 


509 


BL00678 


Trp-Asp (WD) repeat proteins proteins. 


BL00678 9.67 2.421e-09 473-484 


510 


PR00320 


G-PROTEIN BETA WD-40 REPEAT 
SIGNATURE 


PR00320B 12.19 4.774e-11 567-582
PR00320B 12.19 5.886e-10 763-778
PR00320C 13.01 6.760e-10 567-582
PR00320A 16.74 7.618e-10 846-861
PR00320A 16.74 3.415e-09 763-778
PR00320A 16.74 6.268e-09 567-582


511 


BL00479 


Phorbol esters / diacylglycerol binding
domain proteins.


BL00479C 12.01 3.250e-12 170-183


512 


BL50058 


G-protein gamma subunit profile. 


BL50058 27.23 7.494e-09 10-58 


513 


BL00524 


Somatomedin B domain proteins. 


BL00524A 9.65 8.925e-14 80-101 


515 


BL00041 


Bacterial regulatory proteins, araC family
proteins. 


BL00041 23.99 1.964e-19 492-524 


516 


PD00066 


PROTEIN ZINC-FINGER METAL-
BINDI.


PD00066 13.92 8.500e-13 391-404 


517 


BL00415 


Synapsins proteins.


BL00415E 4.82 9.291e-09 959-996


518 


PR00109 


TYROSINE KINASE CATALYTIC 
DOMAIN SIGNATURE 


PR00109B 12.27 9.471e-12 126- 
145 


519 


BL00290 


Immunoglobulins and major 
histocompatibility complex proteins. 


BL00290B 13.17 4.750e-09 47-65 


522 


PR00505 


D12 CLASS N6 ADENINE-SPECIFIC 
DNA METHYLTRANSFERASE 
SIGNATURE 


PR00505A 14.15 7.128e-09 364-381


525 


BL00312 


Glycophorin A proteins. 


BL00312B 9.22 5.781e-10 891-920


528 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 2.500e-32 16-55 


529 


PR00254 


NICOTINIC ACETYLCHOLINE 
RECEPTOR SIGNATURE 


PR00254D 15.50 4.000e-17 131-150
PR00254A 11.23 4.706e-14 61-78
PR00254C 11.36 4.000e-12 113-126
PR00254B 12.97 1.486e-11 95-110


531


BL00741 


Guanine-nucleotide dissociation 
stimulators CDC24 family sign. 


BL00741B 14.27 6.870e-16 787- 
810 


532 


PR00193 


MYOSIN HEAVY CHAIN 
SIGNATURE 


PR00193D 14.36 3.143e-34 447- 
476 PR00193C 12.60 7.632e-32 
216-244 PR00193B 11.69 7.750e- 
29 167-193 PR00193A 15.41 
2.588e-22 111-131 PR00193E 
19.47 2.200e-21 501-530 


533 


PD02870 


RECEPTOR INTERLEUKIN-1 
PRECURSOR. 


PD02870B 18.83 5.596e-09 348- 
381 


535 


PR00683 


SPECTRIN PLECKSTRIN 
HOMOLOGY DOMAIN SIGNATURE 


PR00683D 15.87 2.452e-10 465-484


536 


BL00027 


'Homeobox' domain proteins. 


BL00027 26.43 6.684e-24 164-207 


538 


PR00239 


MOLLUSCAN RHODOPSIN C- 
TERMINAL TAIL SIGNATURE 


PR00239E 1.58 2.739e-09 225-237


539 


BL00406 


Actins proteins. 


BL00406C 6.75 1.000e-40 157-212
BL00406B 5.47 6.143e-37 90-145
BL00406D 12.58 4.600e-36 291-346
BL00406E 8.44 2.200e-33 364-414
BL00406A 9.95 4.441e-23 7-42


540 


PR00456 


RIBOSOMAL PROTEIN P2 
SIGNATURE 


PR00456E 3.06 9.625e-10 44-59 


541 


PR00456 


RIBOSOMAL PROTEIN P2 
SIGNATURE 


PR00456E 3.06 9.625e-10 44-59


542 


PF00023 


Ank repeat proteins. 


PF00023A 16.03 7.857e-11 138-154


544 


PF00642 


Zinc finger C-x8-C-x5-C-x3-H type (and
similar). 


PF00642 11.59 9.082e-10 838-849


546 


BL00383 


Tyrosine specific protein phosphatases 
proteins. 


BL00383E 10.35 4.115e-10 104-115


547 


BL01226 


Hydroxymethylglutaryl-coenzyme A 
synthase proteins. 


BL01226A 13.79 1.000e-40 50-89
BL01226C 13.51 1.000e-40 127-167
BL01226D 11.60 1.000e-40 174-210
BL01226E 13.74 1.000e-40 212-253
BL01226H 17.74 1.000e-40 386-434
BL01226I 25.06 1.000e-40 460-508
BL01226G 15.76 3.483e-32 292-321
BL01226B 13.35 1.818e-31 95-127
BL01226F 9.78 8.714e-23 253-271


549 


BL00964 


Syndecans proteins. 


BL00964B 12.05 2.426e-10 1246-1289


551 


DM01930


2 kw FINGER SMCX SMCY 
YDR096W. 


DM01930E 15.41 1.367e-37 170- 
215 DM01930F 14.16 8.232e-28 
267-303 DM01930B 19.86 
9.163e-10 37-71 


552 


BL00195 


Glutaredoxin proteins. 


BL00195B 15.31 7.158e-09 9-29 


554 


BL00383 


Tyrosine specific protein phosphatases 
proteins. 


BL00383E 10.35 2.756e-12 436- 
447 


555 


PR00403 


WW DOMAIN SIGNATURE 


PR00403B 12.19 7.612e-11 122-137
PR00403A 16.82 3.912e-10 107-121
PR00403B 12.19 2.068e-09 76-91


558 


PR00380 


KINESIN HEAVY CHAIN 
SIGNATURE 


PR00380A 14.18 2.714e-26 76-98
PR00380D 9.93 3.000e-24 275-297
PR00380C 13.18 5.154e-20 226-245
PR00380B 12.64 9.400e-20 195-213


559


BL00518 


Zinc finger, C3HC4 type (RING finger), 
proteins. 


BL00518 12.23 5.333e-09 522-531 


561 


PD01795 


PROTEIN AMINOPEPTIDASE 
PRECURSOR HYDROLASE SIGNA. 


PD01795B 11.56 2.333e-12 159-172
PD01795A 10.27 1.000e-09 135-144


562 


PD01795 


PROTEIN AMINOPEPTIDASE 
PRECURSOR HYDROLASE SIGNA. 


PD01795B 11.56 2.333e-12 110-123
PD01795A 10.27 1.000e-09 86-95


563 


BL00018 


EF-hand calcium-binding domain
proteins.


BL00018 7.41 1.391e-09 41-54 


565 


BL00348 


p53 tumor antigen proteins. 


BL00348F 23.19 4.143e-09 188- 
231 


567 


PD00301 


PROTEIN REPEAT MUSCLE
CALCIUM-BI.


PD00301B 5.49 4.115e-09 284-295


569 


PF00850 


Histone deacetylase family.


PF00850E 8.88 6.553e-21 756-782
PF00850D 14.76 1.519e-16 722-746
PF00850F 15.70 1.118e-11 794-827
PF00850G 22.75 8.375e-11 833-875


570 


PD00289 


PROTEIN SH3 DOMAIN REPEAT
PRESYNA.


PD00289 9.97 4.960e-10 137-151 


571 


BL00518 


Zinc finger, C3HC4 type (RING finger), 
proteins. 


BL00518 12.23 8.800e-11 44-53


573 


BL00299 


Ubiquitin domain proteins. 


BL00299 28.84 1.123e-11 123-175


574 


PF01140 


Matrix protein (MA), p15.


PF01140O 15.54 3.700e-10 986-1021


576 


BL00284 


Serpins proteins. 


BL00284C 28.56 5.200e-26 200-242
BL00284A 15.64 4.913e-18 71-95
BL00284B 17.99 7.261e-15 173-194
BL00284D 16.34 5.846e-13 306-333
BL00284E 19.15 7.429e-12 387-412


579 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 6.553e-29 15-54 


580 


BL50001 


Src homology 2 (SH2) domain proteins 
profile. 


BL50001B 17.40 4.500e-12 1010-1031


581 


PD00930 


PROTEIN GTPASE DOMAIN 
ACTIVATION. 


PD00930B 33.72 3.189e-22 608- 
649 PD00930A 25.62 6.806e-17 
505-531 


584 


BL00612 


Osteonectin domain proteins. 


BL00612B 11.35 2.034e-11 93-126


585 


DM01551 


kw OSTEOINDUCTIVE YOPM 
MEMBRANE OUTER. 


DM01551C 14.62 8.859e-10 102-122


586 


PF00628 


PHD-finger. 


PF00628 15.84 3.455e-12 235-250


587 


BL00027 


'Homeobox' domain proteins.


BL00027 26.43 6.063e-10 85-128


588 


PR00326 


GTP1/OBG GTP-BINDING PROTEIN
FAMILY SIGNATURE 


PR00326A 8.75 7.525e-16 227-248
PR00326C 9.79 6.760e-15 276-292
PR00326D 19.09 6.657e-13 293-312
PR00326B 16.74 9.229e-13 248-267


589 


BL00422 


Granins proteins. 


BL00422A 28.34 7.429e-09 2349- 
2378 


590 


BL00415 


Synapsins proteins. 


BL00415N 4.29 9.794e-10 295- 
339 


591 


BL00128 


Alpha-lactalbumin / lysozyme C proteins. 


BL00128A 20.76 3.423e-13 35-65
BL00128C 19.34 2.980e-11 110-132


596 


PR00049 


WILM'S TUMOUR PROTEIN
SIGNATURE 


PR00049D 0.00 3.136e-09 31-46 


597 


DM00547 


1 kw CHROMO BROMODOMAIN 
SHADOW GLOBAL. 


DM00547C 17.30 1.667e-19 207-229
DM00547E 13.94 6.200e-18 319-342
DM00547B 11.28 1.000e-17 179-193
DM00547D 11.60 9.250e-13 289-303
DM00547F 23.43 6.727e-12 679-726
DM00547A 12.38 4.818e-11 158-170


600 


PD01066 


PROTEIN ZINC FINGER ZINC-
FINGER METAL-BINDING NU.


PD01066 19.43 1.882e-27 13-52 


601 


BL00192 


Cytochrome b/b6 heme-ligand proteins. 


BL00192A 11.90 6.400e-09 390- 
430 


602 


BL00936 


Ribosomal protein L35 proteins. 


BL00936B 27.27 8.615e-09 118-157


603 


BL00936 


Ribosomal protein L35 proteins. 


BL00936B 27.27 8.615e-09 118- 
157 


606 


PR00019 


LEUCINE-RICH REPEAT 
SIGNATURE 


PR00019B 11.36 7.300e-10 292-306
PR00019A 11.19 5.667e-09 323-337


607 


PR00019 


LEUCINE-RICH REPEAT 
SIGNATURE 


PR00019B 11.36 7.300e-10 292-306
PR00019A 11.19 5.667e-09 323-337


608 


PR00320 


G-PROTEIN BETA WD-40 REPEAT 
SIGNATURE 


PR00320C 13.01 9.500e-12 168-183
PR00320A 16.74 2.853e-10 60-75
PR00320A 16.74 4.706e-10 14-29
PR00320C 13.01 5.320e-10 60-75
PR00320C 13.01 5.680e-10 14-29
PR00320A 16.74 6.049e-09 217-232
PR00320B 12.19 8.875e-09 168-183


610 


BL00750 


Chaperonins TCP-1 proteins. 


BL00750B 16.17 1.000e-40 70-120
BL00750A 20.07 6.211e-37 26-69
BL00750G 20.12 8.800e-31 431-471
BL00750F 18.40 5.125e-30 370-411
BL00750E 24.59 8.650e-29 295-332
BL00750H 21.44 1.000e-27 489-524
BL00750C 25.65 5.345e-17 149-181
BL00750D 16.16 6.318e-14 203-222


613 


BL00766 


Tetrahydrofolate
dehydrogenase/cyclohydrolase proteins.


BL00766B 24.49 1.000e-40 142-190
BL00766E 13.78 1.000e-40 322-359
BL00766C 25.86 5.500e-39 208-256
BL00766D 17.05 4.536e-26 283-313
BL00766A 21.48 6.063e-24 102-132


615 


BL00256 


Adipokinetic hormone family proteins. 


BL00256 12.28 3.298e-10 746-755


616 


BL00319 


Amyloidogenic glycoprotein extracellular 
domain proteins. 


BL00319C 17.12 9.053e-09 419- 
453 


617 


BL00030 


Eukaryotic RNA-binding region RNP-1 
proteins. 


BL00030A 14.39 4.429e-09 44-63


618 


BL00030 


Eukaryotic RNA-binding region RNP-1
proteins. 


BL00030A 14.39 4.429e-09 44-63


620 


BL00325 


Actin-depolymerizing proteins. 


BL00325B 21.66 5.817e-16 77- 
123 


622 


BL00972 


Ubiquitin carboxyl-terminal hydrolases
family 2 proteins.


BL00972A 11.93 5.500e-19 213-231
BL00972D 22.55 2.742e-16 501-526
BL00972B 9.45 1.000e-11 297-307
BL00972C 16.48 3.160e-11 370-385
BL00972E 20.72 7.517e-10 526-548


625 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 6.333e-39 6-45 


628 


BL00039 


DEAD-box subfamily ATP-dependent 
helicases proteins. 


BL00039D 21.67 7.750e-31 478- 
524 BL00039A 18.44 2.000e-25 
198-237 BL00039C 15.63 1.844e- 
15 327-351 BL00039B 19.19 
5.636e-14 242-268 


630 


PD00306 


PROTEIN GLYCOPROTEIN 
PRECURSOR RE. 


PD00306A 10.26 7.000e-12 232-246


631 


PD00306 


PROTEIN GLYCOPROTEIN 
PRECURSOR RE. 


PD00306A 10.26 7.000e-12 290- 
304 


633 


BL00785 


5'-nucleotidase proteins. 


BL00785C 9.45 3.625e-16 108-122
BL00785E 15.85 4.000e-16 279-295
BL00785A 9.73 6.500e-14 29-40
BL00785B 10.65 5.500e-13 72-86
BL00785D 9.89 4.000e-12 135-145


636 


PR00832 


PAXILLIN SIGNATURE 


PR00832E 14.43 9.901e-14 85- 
108 


637 


PR00109


TYROSINE KINASE CATALYTIC 
DOMAIN SIGNATURE 


PR00109B 12.27 6.362e-13 221- 
240 


638 


PF00635 


MSP (Major sperm protein) domain 
proteins. 


PF00635B 15.84 4.900e-11 463-502


639 


PR00860 


VERTEBRATE METALLOTHIONEIN 
SIGNATURE 


PR00860B 7.04 1.900e-18 85-99
PR00860C 9.61 1.474e-14 99-109
PR00860A 5.46 1.720e-14 63-76


641 


PD00066 


PROTEIN ZINC-FINGER METAL-
BINDI.


PD00066 13.92 4.462e-15 271-284
PD00066 13.92 4.462e-15 299-312
PD00066 13.92 2.800e-14 327-340
PD00066 13.92 2.800e-14 383-396
PD00066 13.92 2.800e-14 411-424
PD00066 13.92 7.000e-14 355-368
PD00066 13.92 8.800e-14 439-452
PD00066 13.92 8.800e-14 495-508
PD00066 13.92 1.500e-13 551-564
PD00066 13.92 7.000e-13 467-480
PD00066 13.92 7.000e-13 523-536
PD00066 13.92 9.500e-13 215-228
PD00066 13.92 9.500e-13 243-256
PD00066 13.92 9.500e-13 579-592
PD00066 13.92 8.615e-10 607-620
PD00066 13.92 1.600e-09 187-200


642 


BL00961 


Ribosomal protein S28e proteins. 


BL00961B 11.24 7.429e-37 67-100
BL00961A 9.90 4.079e-26 42-66


643 


BL00585 


Ribosomal protein S5 proteins. 


BL00585A 28.43 1.391e-40 103-155
BL00585B 18.78 3.250e-30 193-230


647 


BL00678 


Trp-Asp (WD) repeat proteins proteins. 


BL00678 9.67 9.400e-10 181-192 


648 


PR00876 


NEMATODE METALLOTHIONEIN 
SIGNATURE 


PR00876C 6.15 9.229e-09 112- 
126 


652 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 5.941e-27 29-68


653 


BL00047 


Histone H4 proteins. 


BL00047A 13.53 1.000e-40 2-41
BL00047B 6.51 1.429e-40 41-74
BL00047C 12.18 1.310e-38 74-104


654 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 4.109e-25 30-69 


655 


BL01115 


GTP-binding nuclear protein ran proteins. 


BL01115A 10.22 3.483e-17 19-63


657 


BL00518 


Zinc finger, C3HC4 type (RING finger), 
proteins. 


BL00518 12.23 8.286e-10 31-40 


658 


BL00125 


Serine/threonine specific protein
phosphatases proteins.


BL00125B 21.48 1.000e-40 89-135
BL00125C 19.97 1.000e-40 153-200
BL00125D 33.11 1.000e-40 213-268
BL00125A 14.83 8.941e-38 47-84


659 


PD00066 


PROTEIN ZINC-FINGER METAL- 
BINDI. 


PD00066 13.92 8.200e-16 492-505
PD00066 13.92 9.308e-15 380-393
PD00066 13.92 6.000e-13 352-365
PD00066 13.92 7.000e-13 240-253
PD00066 13.92 7.500e-13 268-281
PD00066 13.92 7.500e-13 408-421
PD00066 13.92 2.174e-11 464-477
PD00066 13.92 1.000e-10 436-449


660 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 2.189e-26 29-68 


661 


BL00795 


Involucrin proteins. 


BL00795C 17.06 7.882e-15 193-238
BL00795C 17.06 3.797e-13 187-232
BL00795C 17.06 5.014e-13 188-233
BL00795C 17.06 4.506e-12 196-241
BL00795C 17.06 7.896e-12 191-236
BL00795C 17.06 1.667e-11 185-230
BL00795C 17.06 2.000e-11 198-243
BL00795C 17.06 3.778e-11 171-216
BL00795C 17.06 6.111e-11 197-242
BL00795C 17.06 6.444e-11 194-239
BL00795C 17.06 8.000e-11 189-234
BL00795C 17.06 8.556e-11 192-237
BL00795C 17.06 1.733e-10 195-240
BL00795C 17.06 2.779e-10 184-229
BL00795C 17.06 4.035e-10 199-244
BL00795C 17.06 5.081e-10 186-231
BL00795C 17.06 6.965e-10 190-235
BL00795C 17.06 2.700e-09 200-245
BL00795C 17.06 5.800e-09 175-220
BL00795C 17.06 6.500e-09 182-227
BL00795C 17.06 6.600e-09 201-246
BL00795C 17.06 6.600e-09 202-247
BL00795C 17.06 6.600e-09 208-253


662 


BL00469 


Nucleoside diphosphate kinases proteins. 


BL00469 22.22 l.OOOe-40 149-204 


663 


BL01160 


Kinesin light chain repeat proteins. 


BL01160B 19.54 9.411e-11 331-385


664 


BL00601 


Tryptophan pentad repeat proteins (IRF 
family) proteins. 


BL00601A 20.29 5.500e-23 7-46
BL00601B 20.92 3.631e-13 69-98 


665 


BL00082 


Extradiol ring-cleavage dioxygenases 
proteins. 


BL00082A 19.07 8.615e-12 49-72 


666 


DM01537 


kw SKI2 W SKI2 NUCLEOLAR
HELICASE.


DM01537B 21.63 4.073e-37 834-881
DM01537B 21.63 9.750e-21 1669-1716
DM01537A 15.14 8.650e-18 698-718
DM01537A 15.14 6.766e-12 1537-1557


667 


DM01537 


kw SKI2 W SKI2 NUCLEOLAR 
HELICASE. 


DM01537B 21.63 7.923e-38 820-867
DM01537B 21.63 9.750e-21 1655-1702
DM01537A 15.14 8.650e-18 684-704
DM01537A 15.14 6.766e-12 1523-1543


669 


BL00107 


Protein kinases ATP-binding region
proteins.


BL00107A 18.39 6.786e-24 849- 
880 BL00107B 13.31 6.727e-13 
916-932 


670 


BL00299 


Ubiquitin domain proteins. 


BL00299 28.84 9.735e-27 37-89


671 


BL00027 


'Homeobox' domain proteins.


BL00027 26.43 6.571e-12 432-475


676 


PR00861 


ALPHA-LYTIC ENDOPEPTIDASE 
SERINE PROTEASE (S2A) 
SIGNATURE 


PR00861E 9.88 2.385e-09 206- 
221 


678 


BL00225 


Crystallins beta and gamma 'Greek key'
motif proteins. 


BL00225B 18.06 7.517e-24 1805-1840
BL00225B 18.06 8.297e-20 1987-2022
BL00225B 18.06 2.575e-19 1896-1931
BL00225B 18.06 8.200e-19 175-210
BL00225B 18.06 8.200e-19 1698-1733
BL00225B 18.06 4.808e-14 73-108
BL00225B 18.06 4.808e-14 1596-1631
BL00225B 18.06 5.500e-14 2077-2112
BL00225A 13.82 5.829e-12 2043-2064
BL00225A 13.82 3.127e-09 1759-1780


679 


PR00320 


G-PROTEIN BETA WD-40 REPEAT 
SIGNATURE 


PR00320C 13.01 4.240e-10 169-184
PR00320A 16.74 6.294e-10 169-184


680 


BL00243 


Integrins beta chain cysteine-rich domain 
proteins. 


BL00243I 31.77 1.143e-11 172-215


681 


PR00852 


XERODERMA PIGMENTOSUM 
GROUP D PROTEIN SIGNATURE 


PR00852H 5.90 1.000e-29 612-635
PR00852E 8.14 3.769e-27 348-371
PR00852D 11.38 8.875e-27 309-331
PR00852B 11.08 2.800e-25 249-269
PR00852I 17.26 3.500e-25 683-704
PR00852F 11.85 5.909e-24 379-398
PR00852G 16.19 4.462e-23 468-486
PR00852C 8.81 9.143e-23 284-303


682 


BL50058 


G-protein gamma subunit profile. 


BL50058 27.23 1.375e-35 15-63 


685 


BL00972 


Ubiquitin carboxyl-terminal hydrolases 
family 2 proteins. 


BL00972A 11.93 7.500e-20 40-58
BL00972D 22.55 3.903e-16 300-325
BL00972B 9.45 1.000e-13 120-130
BL00972E 20.72 5.500e-11 325-347


687 


BL00237 


G-protein coupled receptors proteins. 


BL00237A 27.68 4.273e-14 98- 
138 


688 


BL00388 


Proteasome A-type subunits proteins. 


BL00388A 23.14 1.000e-40 8-54
BL00388B 31.38 3.864e-33 66-108
BL00388D 20.71 1.000e-21 153-184
BL00388C 18.79 8.147e-16 126-148


689 


PD02796 


PROTEIN STEROL CARRIER LIPID-
TRAN.


PD02796B 20.92 1.105e-15 347-394


691 


PD01572 


PHOTOSYSTEM II REACTION 
CENTRE T PROTEIN PHOTOS. 


PD01572 8.77 4.083e-09 1-31


692 


BL00028 


Zinc finger, C2H2 type, domain proteins.


BL00028 16.07 7.600e-10 488-505 


694 


BL01013 


Oxysterol-binding protein family 
proteins. 


BL01013A 25.14 9.357e-33 527-563
BL01013D 26.81 8.235e-23 814-858
BL01013C 9.97 6.211e-14 615-625
BL01013B 11.33 3.605e-13 592-603


695 


PD00289 


PROTEIN SH3 DOMAIN REPEAT 
PRESYNA. 


PD00289 9.97 3.571e-13 164-178
PD00289 9.97 8.650e-11 2147-2161
PD00289 9.97 2.552e-09 23-37


698 


PR00161 


NICKEL-DEPENDENT 
HYDROGENASE/B-TYPE 
CYTOCHROME SIGNATURE 


PR00161C 9.51 4.930e-09 282-302


700 


PR00749 


LYSOZYME G SIGNATURE 


PR00749F 13.63 8.636e-13 139-156
PR00749H 8.22 3.681e-12 173-194
PR00749B 16.54 1.419e-11 48-70
PR00749C 7.26 3.060e-11 72-91
PR00749A 10.33 4.815e-10 24-45


703 


PR00704 


CALPAIN CYSTEINE PROTEASE (C2) 
FAMILY SIGNATURE 


PR00704I 9.52 1.000e-29 476-505
PR00704D 11.05 2.500e-27 132-158
PR00704E 12.55 5.500e-27 162-186
PR00704F 13.61 1.000e-22 187-215
PR00704G 13.87 1.237e-21 317-339
PR00704H 13.38 8.138e-21 367-385
PR00704A 14.68 2.125e-19 27-51
PR00704C 11.88 1.257e-17 96-113
PR00704B 17.94 1.833e-15 72-95


705 


PR00859 


PROKARYOTE METALLOTHIONEIN 
SIGNATURE 


PR00859C 7.06 2.776e-09 94-111


706 


BL00226 


Intermediate filaments proteins. 


BL00226D 19.10 9.581e-26 369- 
416 BL00226B 23.86 3.250e-24 
203-251 BL00226C 13.23 8.269e- 
21 268-299 BL00226A 12.77 
8.200e-14 103-118 


707 


PR00021 


SMALL PROLINE-RICH PROTEIN 
SIGNATURE 


PR00021A 4.31 2.440e-10 2-15


708 


BL00361 


Ribosomal protein S10 proteins.


BL00361B 18.34 5.101e-10 82- 
105 


709 


PR00021 


SMALL PROLINE-RICH PROTEIN 
SIGNATURE 


PR00021A 4.31 2.200e-10 2-15


710 


BL00514 


Fibrinogen beta and gamma chains C-
terminal domain proteins.


BL00514C 17.41 8.412e-27 160-197
BL00514E 14.28 8.909e-16 219-236
BL00514H 14.95 1.551e-15 317-342
BL00514G 15.98 7.750e-15 284-314
BL00514D 15.35 4.789e-10 201-214


711 


PD00930 


PROTEIN GTPASE DOMAIN 
ACTIVATION. 


PD00930B 33.72 8.714e-12 49-90 


714 


BL00400 


LBP / BPI / CETP family proteins. 


BL00400C 24.53 6.029e-17 158-202
BL00400D 23.26 2.080e-14 222-259
BL00400A 21.59 1.600e-10 27-59


715 


BL01154 


RNA polymerases L / 13 to 16 Kd
subunits proteins.


BL01154B 24.55 5.500e-36 40-76
BL01154A 18.70 3.000e-22 19-40


716 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 9.786e-32 10-49


717 


BL00215 


Mitochondrial energy transfer proteins. 


BL00215A 15.82 9.206e-14 77-102
BL00215A 15.82 8.412e-10 175-200


719 


BL00309 


Vertebrate galactoside-binding lectin 
proteins. 


BL00309C 18.65 2.241e-09 62-87 


726 


BL00687 


Aldehyde dehydrogenases glutamic acid 
proteins. 


BL00687E 25.37 7.136e-33 266-316
BL00687D 26.00 5.333e-28 151-198
BL00687B 17.54 3.647e-26 39-81
BL00687C 24.13 6.087e-22 96-133
BL00687F 9.55 2.500e-11 352-363


727 


DM01354 


kw TRANSCRIPTASE REVERSE U
ORF2.


DM01354N 13.17 1.000e-40 129-174
DM01354O 8.73 6.605e-15 180-226


734 


PD00301 


PROTEIN REPEAT MUSCLE 
CALCIUM-BI. 


PD00301A 10.24 6.400e-09 101- 
112 


735 


BL01024 


Protein phosphatase 2A regulatory 
subunit PR55 proteins. 


BL01024A 10.26 1.000e-40 22-69
BL01024B 8.91 1.000e-40 86-127
BL01024C 7.80 1.000e-40 146-185
BL01024D 13.22 1.000e-40 185-222
BL01024E 11.96 1.000e-40 222-266
BL01024F 9.42 1.000e-40 266-317
BL01024G 11.09 1.000e-40 317-349
BL01024H 13.88 1.000e-40 389-442


736 


PF00913 


Trypanosome variant surface 
glycoprotein. 


PF00913D 11.90 7.130e-10 24-51


737 


PR00700 


PROTEIN TYROSINE PHOSPHATASE 
SIGNATURE 


PR00700D 12.47 2.200e-09 82- 
101 


740 


PR00320 


G-PROTEIN BETA WD-40 REPEAT 
SIGNATURE 


PR00320C 13.01 1.600e-09 68-83 
PR00320A 16.74 7.366e-09 68-83 


743 


PR00871 


DNA
NUCLEOTIDYLEXOTRANSFERASE
(TDT) SIGNATURE


PR00871G 14.48 8.000e-09 178- 
201 


745 


BL00518 


Zinc finger, C3HC4 type (RING finger), 
proteins. 


BL00518 12.23 2.286e-10 33-42 


749 


BL00215 


Mitochondrial energy transfer proteins. 


BL00215A 15.82 5.200e-15 221-246
BL00215A 15.82 7.618e-14 20-45
BL00215A 15.82 8.851e-11 123-148
BL00215B 10.44 9.526e-11 69-82
BL00215B 10.44 7.300e-09 272-285
BL00215B 10.44 8.500e-09 165-178


751 


BL50002 


Src homology 3 (SH3) domain proteins 
profile. 


BL50002A 14.19 1.000e-14 370-389
BL50002B 15.18 2.200e-10 408-422


752 


BL00353 


HMG1/2 proteins.


BL00353B 11.47 3.089e-12 390-440


753 


PF00622 


Domain in SPla and the RYanodine 
Receptor. 


PF00622B 21.00 4.214e-14 47-69


754 


BL00211 


ABC transporters family proteins. 


BL00211A 12.23 8.941e-10 66-78


755 


PR00926 


MITOCHONDRIAL CARRIER
PROTEIN SIGNATURE 


PR00926F 17.75 7.750e-19 392- 
415 PR00926C 16.07 5.935e-17 
253-274 PR00926D 10.53 2.059e- 
15 301-320 PR00926E 11.70 











4.971e-15 344-363 PR00926B 
16.07 9.526e-13 210-225 
PR00926A 10.41 1.514e-12 197- 
211 


756 


BL01187 


Calcium-binding EGF-like domain 
proteins pattern proteins. 


BL01187A 9.98 2.125e-12 324-336
BL01187A 9.98 4.789e-11 377-389
BL01187B 12.04 3.057e-10 439-455


757 


PF00651 


BTB (also known as BR-C/Ttk) domain 
proteins.


PF00651 15.00 4.429e-10 43-56 


758 


PR00055 


HIV TAT DOMAIN SIGNATURE 


PR00055A 8.13 8.855e-09 144- 
156 


759 


PD00066


PROTEIN ZINC-FINGER METAL- 
BINDI. 


PD00066 13.92 5.304e-11 110-123


760 


PR00448 


NSF ATTACHMENT PROTEIN 
SIGNATURE


PR00448D 12.42 3.455e-27 162-186
PR00448A 10.74 1.273e-22 37-57
PR00448B 16.01 9.379e-21 100-118
PR00448C 11.46 1.000e-20 129-147


765 


BL01042 


Homoserine dehydrogenase proteins. 


BL01042A 13.29 5.909e-11 74-95


766 


PR00625 


DNA J PROTEIN FAMILY 
SIGNATURE 


PR00625A 12.84 2.154e-18 26-46 
PR00625B 13.48 9.000e-16 57-78 


768 


BL00762 


WHEP-TRS domain proteins. 


BL00762A 23.43 8.500e-28 112-149
BL00762B 16.14 3.793e-12 64-78
BL00762A 23.43 6.625e-12 6-43
BL00762C 15.58 4.176e-09 459-472
BL00762D 11.15 9.667e-09 210-220


769 


PR00709 


AVIDIN SIGNATURE 


PR00709A 4.60 1.934e-09 1-20 


770


PR00320 


G-PROTEIN BETA WD-40 REPEAT 
SIGNATURE 


PR00320C 13.01 1.720e-10 262-277
PR00320A 16.74 2.853e-10 262-277
PR00320C 13.01 4.300e-09 96-111
PR00320B 12.19 5.500e-09 262-277
PR00320A 16.74 6.268e-09 55-70


771


PR00019 


LEUCINE-RICH REPEAT 
SIGNATURE 


PR00019B 11.36 8.714e-12 87-101
PR00019A 11.19 1.000e-10 90-104


772 


PD02807 


APOLIPOPROTEIN E PRECURSOR 
APO-E GLYCOPROTEIN PLAS. 


PD02807C 8.91 6.308e-10 110- 
159 


773 


PD02807 


APOLIPOPROTEIN E PRECURSOR 
APO-E GLYCOPROTEIN PLAS. 


PD02807C 8.91 6.308e-10 155- 
204 


774 


DM00547 


1 kw CHROMO BROMODOMAIN 
SHADOW GLOBAL. 


DM00547F 23.43 3.942e-28 943-990
DM00547E 13.94 9.750e-21 652-675
DM00547B 11.28 1.818e-18 518-532
DM00547C 17.30 3.531e-17 546-568
DM00547A 12.38 1.273e-11 497-509
DM00547D 11.60 9.200e-11 622-636


776 


PR00779 


INOSITOL 1,4,5-TRISPHOSPHATE-
BINDING PROTEIN RECEPTOR 
SIGNATURE 


PR00779F 14.51 5.147e-09 769- 
792 


777


PR00779 


INOSITOL 1,4,5-TRISPHOSPHATE-
BINDING PROTEIN RECEPTOR 
SIGNATURE 


PR00779F 14.51 5.147e-09 742- 
765 


778


PR00779 


INOSITOL 1,4,5-TRISPHOSPHATE- 
BINDING PROTEIN RECEPTOR 
SIGNATURE 


PR00779F 14.51 5.147e-09 742-765


779


BL01282 


BIR repeat proteins. 


BL01282B 30.49 2.543e-09 6-45 


781 


PR00205 


CADHERIN SIGNATURE 


PR00205B 11.39 3.118e-11 654-672
PR00205B 11.39 8.588e-11 230-248
PR00205B 11.39 8.527e-10 551-569
PR00205B 11.39 4.203e-09 336-354


783 


BL00625 


Regulator of chromosome condensation 
(RCC1) proteins.


BL00625B 17.69 2.167e-19 193-227
BL00625A 16.21 5.500e-17 199-228
BL00625B 17.69 1.885e-16 140-174
BL00625B 17.69 2.770e-16 245-279
BL00625A 16.21 9.115e-16 251-280
BL00625A 16.21 6.507e-14 146-175


785 


PF00084 


Sushi domain proteins (SCR repeat 
proteins. 


PF00084B 9.45 7.188e-10 595-607 
PF00084B 9.45 6.400e-09 656-668


786 


PF00084 


Sushi domain proteins (SCR repeat 
proteins. 


PF00084B 9.45 7.188e-10 595-607 
PF00084B 9.45 6.400e-09 656-668 


787 


BL00826 


MARCKS family proteins. 


BL00826C 7.63 6.738e-09 203-230


788 


PR00453 


VON WILLEBRAND FACTOR TYPE 
A DOMAIN SIGNATURE 


PR00453A 12.79 1.310e-14 36-54 
PR00453B 14.65 8.568e-10 75-90


789 


PR00102 


ORNITHINE CARBAMOYLTRANSFERASE
SIGNATURE


PR00102B 14.82 5.418e-09 963- 
977 


790 


BL00030 


Eukaryotic RNA-binding region RNP-1 
proteins. 


BL00030B 7.03 5.500e-11 199-209


791 


BL00415 


Synapsins proteins. 


BL00415N 4.29 9.519e-10 393-437
BL00415N 4.29 2.117e-09 103-147
BL00415N 4.29 3.628e-09 97-141
BL00415N 4.29 5.664e-09 387-431


795 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 2.091e-36 105-144 


799 


PF00731 


AIR carboxylase. 


PF00731C 23.16 7.333e-35 337-380
PF00731B 19.47 7.429e-28 299-336
PF00731A 19.32 6.333e-24 268-297


804 


BL00170 


Cyclophilin-type peptidyl-prolyl cis-trans 
isomerase signatur. 


BL00170B 20.97 8.071e-09 297- 
337 


805 


BL00678 


Trp-Asp (WD) repeat proteins proteins.


BL00678 9.67 3.400e-10 378-389 
BL00678 9.67 5.800e-10 418-429
BL00678 9.67 8.800e-10 295-306 


806 


PD01719 


PRECURSOR GLYCOPROTEIN
SIGNAL RE. 


PD01719A 12.89 7.571e-14 290-318


807 


PR00320 


G-PROTEIN BETA WD-40 REPEAT 
SIGNATURE 


PR00320B 12.19 9.100e-09 451-466


809 


BL00107 


Protein kinases ATP-binding region 
proteins. 


BL00107A 18.39 4.462e-12 564-595


810 


PR00453 


VON WILLEBRAND FACTOR TYPE 
A DOMAIN SIGNATURE 


PR00453A 12.79 1.310e-14 36-54
PR00453B 14.65 8.568e-10 75-90


814 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 2.047e-31 16-55 


815 


PD01066


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 2.047e-31 16-55


817 


PR00193 


MYOSIN HEAVY CHAIN 
SIGNATURE 


PR00193D 14.36 5.154e-36 125- 
154 PR00193E 19.47 3.919e-18 
179-208 


818 


PR00830 


ENDOPEPTIDASE LA (LON) SERINE
PROTEASE (S16) SIGNATURE


PR00830A 8.41 9.571e-11 115-135


819 


BL00126 


3'5'-cyclic nucleotide phosphodiesterases
proteins. 


BL00126C 22.07 7.857e-24 528-569
BL00126E 35.22 3.714e-15 669-724
BL00126D 25.50 1.173e-14 584-623
BL00126B 15.20 1.000e-12 502-514
BL00126A 27.56 3.361e-09 461-498


820 


PR00511 


TEKTIN SIGNATURE 


PR00511B 12.25 8.826e-22 174- 
195 PR00511A 13.59 7.723e-ir 
155-172 


821 


BL00741 


Guanine-nucleotide dissociation 
stimulators CDC24 family sign. 


BL00741B 14.27 2.800e-15 13-36 


822 


PF00780


Domain found in NIK1-like kinases,
mouse citron and yeast ROM. 


PF007801 14.69 4.825e-09 231- 
261 


827 


BL00030 


Eukaryotic RNA-binding region RNP-1 
proteins. 


BL00030A 14.39 5.235e-11 144-163


828 


BL00326 


Tropomyosins proteins. 


BL00326D 8.76 9.357e-11 545-586


829 


PD02448 


TRANSCRIPTION PROTEIN DNA- 
BINDIN. 


PD02448A 9.37 1.000e-40 46-85
PD02448B 10.17 1.000e-40 85-133
PD02448C 13.62 1.000e-40 152-189
PD02448E 11.33 9.000e-30 235-261
PD02448F 14.22 9.654e-25 279-303
PD02448D 11.48 3.659e-18 197-211
PD02448G 10.73 7.857e-16 305-318


830 


BL00720 


Guanine-nucleotide dissociation
stimulators CDC25 family sign.


BL00720B 16.57 4.500e-23 483-507


831 


BL00107 


Protein kinases ATP-binding region 
proteins. 


BL00107A 18.39 6.625e-21 143- 
174 BL00107B 13.31 4.214e-10 
213-229 


832 


BL00215 


Mitochondrial energy transfer proteins. 


BL00215A 15.82 5.787e-11 32-57


833 


PR00497 


NEUTROPHIL CYTOSOL FACTOR 
P40 SIGNATURE 


PR00497A 6.92 4.375e-09 41-59 


834 


BL00229 


Tau and MAP proteins tubulin-binding 
domain proteins. 


BL00229A 23.57 9.565e-10 99- 
138 


835 


BL00421 


Transmembrane 4 family proteins. 


BL00421E 20.97 2.216e-09 1053- 
1083 


836 


BL00795 


Involucrin proteins. 


BL00795B 12.41 7.931e-09 405-445


837 


PR00020 


MAM DOMAIN SIGNATURE 


PR00020A 18.17 1.000e-17 34-53
PR00020B 15.52 5.846e-16 68-85
PR00020D 12.70 2.543e-15 147-162
PR00020C 13.66 3.483e-13 95-107
PR00020E 8.64 6.586e-13 165-179


838 


BL50017 


Death domain proteins profile. 


BL50017B 17.60 6.897e-13 1499- 
1515 


839 


PF00850 


Histone deacetylase family. 


PF00850C 14.55 9.542e-09 1352- 
1369 


840 


PF00023 


Ank repeat proteins. 


PF00023A 16.03 4.500e-12 44-60
PF00023B 14.20 7.923e-11 73-83
PF00023B 14.20 9.000e-10 139-149
PF00023B 14.20 5.500e-09 40-50


842 


BL01194 


Ribosomal protein L15e proteins. 


BL01194B 13.66 1.000e-40 37-85
BL01194C 12.35 9.250e-40 103-138
BL01194A 18.70 7.632e-38 2-37
BL01194D 19.02 2.658e-36 139-178


843 


BL00610 


Sodium:neurotransmitter symporter
family proteins. 


BL00610A 17.73 1.000e-40 40-90
BL00610B 23.65 1.000e-40 104-154
BL00610C 12.94 1.000e-40 206-258
BL00610E 20.34 1.000e-40 355-398
BL00610F 29.02 1.000e-40 454-509
BL00610D 20.97 6.063e-35 272-325
BL00610G 12.89 8.588e-13 514-537


845 


BL00143 


Insulinase family, zinc-binding region 
proteins. 


BL00143A 20.91 4.300e-20 94-121
BL00143C 14.16 5.500e-13 245-258
BL00143B 14.41 9.053e-10 141-156


846 


PR00543 


OESTROGEN RECEPTOR 
SIGNATURE 


PR00543D 10.87 1.355e-09 898- 
914 


847 


PR00543 


OESTROGEN RECEPTOR 
SIGNATURE 


PR00543D 10.87 1.355e-09 898- 
914 


848 


BL00824 


Elongation factor 1 beta/beta'/delta chain
proteins. 


BL00824C 14.58 1.000e-40 129-167
BL00824D 14.04 6.192e-39 167-202
BL00824B 9.21 2.080e-21 96-116
BL00824E 12.49 3.333e-19 210-226
BL00824A 13.78 8.650e-14 19-34


849 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 l.OOOe-40 12-51 


850 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 7.316e-24 10-49 


852


BL01272 


Glucokinase regulatory protein family 
proteins. 


BL01272B 19.61 6.870e-30 136-171
BL01272C 11.68 3.314e-25 249-274
BL01272A 6.49 1.231e-18 99-117


853 


PD00930 


PROTEIN GTPASE DOMAIN 
ACTIVATION. 


PD00930B 33.72 9.341e-20 65- 
106 


854 


PD00289 


PROTEIN SH3 DOMAIN REPEAT 
PRESYNA. 


PD00289 9.97 6.850e-ll 140-154 


858 


PR00450 


RECOVERIN FAMILY SIGNATURE 


PR00450C 12.22 3.250e-25 68-90
PR00450B 11.76 8.125e-23 22-42
PR00450D 16.58 8.920e-22 92-112
PR00450E 12.14 1.581e-19 114-133
PR00450G 15.33 5.500e-19 166-187
PR00450F 12.30 4.375e-15 140-156
PR00450A 13.58 1.857e-14 8-23


860 


BL00027 


'Homeobox' domain proteins.


BL00027 26.43 7.188e-27 74-117


866 


BL00477 


Alpha-2-macroglobulin family thiolester 
region proteins. 


BL00477L 23.51 7.480e-20 54-87 


867 


BL01078 


Molybdenum cofactor biosynthesis 
proteins. 


BL01078B 14.20 1.621e-20 408-429
BL01078A 10.16 2.000e-13 366-379
BL01078D 5.99 3.455e-11 566-576
BL01078C 10.52 3.793e-11 501-513


868 


BL01177 


Anaphylatoxin domain proteins. 


BL01177E 20.64 5.800e-24 462-489
BL01177C 17.39 5.333e-19 416-435
BL01177B 13.61 7.840e-16 122-138
BL01177D 17.50 1.900e-15 441-459


869 


BL01177 


Anaphylatoxin domain proteins. 


BL01177E 20.64 5.800e-24 415-442
BL01177C 17.39 5.333e-19 369-388
BL01177B 13.61 7.840e-16 122-138
BL01177D 17.50 1.900e-15 394-412


871 


BL50007 


Phosphatidylinositol-specific 
phospholipase X-box domain proteins 
prof. 


BL50007A 19.61 1.000e-40 322-368
BL50007D 19.54 1.000e-40 589-631
BL50007B 20.90 6.700e-36 383-421
BL50007E 25.63 9.053e-33 748-785
BL50007C 8.97 5.200e-19 452-469


872 


BL00972 


Ubiquitin carboxyl-terminal hydrolases 
family 2 proteins. 


BL00972D 22.55 3.250e-17 90- 
115 


874 


PR00452 


SH3 DOMAIN SIGNATURE 


PR00452B 11.65 4.250e-09 370-386


877 


BL00741 


Guanine-nucleotide dissociation 
stimulators CDC24 family sign. 


BL00741B 14.27 5.500e-13 1343- 
1366 


878 


DM00215 


PROLINE-RICH PROTEIN 3. 


DM00215 19.43 2.525e-09 52-85


881 


PD02807 


APOLIPOPROTEIN E PRECURSOR 
APO-E GLYCOPROTEIN PLAS. 


PD02807E 10.90 4.702e-09 358-407


882 


PD01066


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 7.188e-37 8-47 


885 


PF00023 


Ank repeat proteins. 


PF00023A 16.03 8.071e-09 10-26 


886 


PR00372 


BIOPTERIN-DEPENDENT 
AROMATIC AMINO ACID 
HYDROXYLASE SIGNATURE 


PR00372B 10.30 9.308e-27 225-248
PR00372A 13.39 7.000e-24 134-154
PR00372E 12.62 2.125e-23 360-380
PR00372C 7.90 3.025e-22 289-309
PR00372F 13.09 6.333e-21 395-414
PR00372D 10.22 1.000e-19 329-348


887 


BL00301 


GTP-binding elongation factors proteins. 


BL00301B 20.09 2.800e-24 103- 
135 BL00301A 12.41 4.316e-13 
21-33 


888 


BL00518 


Zinc finger, C3HC4 type (RING finger), 
proteins. 


BL00518 12.23 1.667e-09 30-39 


889 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 4.906e-26 6-45 


890 


DM00179 


w KINASE ALPHA ADHESION T- 
CELL. 


DM00179 13.97 7.652e-09 113- 
123 


892 


BL01022 


PTR2 family proton/oligopeptide 
symporters proteins.


BL01022B 22.19 6.016e-14 72-118
BL01022E 23.51 1.173e-12 472-508
BL01022A 11.58 9.135e-12 42-61
BL01022D 9.42 3.455e-11 199-212


893 


PD02407 


3-BISPHOSPHOGLYCERATE- 
INDEPENDENT PHOSPHOGLYCER. 


PD02407K 12.59 6.529e-10 360-383


894 


PD02407 


3-BISPHOSPHOGLYCERATE- 
INDEPENDENT PHOSPHOGLYCER. 


PD02407K 12.59 6.529e-10 360- 
383 


895 


PR00237 


RHODOPSIN-LIKE GPCR
SUPERFAMILY SIGNATURE


PR00237B 13.50 9.100e-14 116-138
PR00237F 13.57 1.360e-13 312-337
PR00237G 19.63 9.069e-13 353-380
PR00237E 13.03 7.120e-12 243-267
PR00237D 8.94 4.150e-11 194-216
PR00237A 11.48 4.375e-11 83-108


896 


BL00129 


Glycosyl hydrolases family 31 proteins.


BL00129D 16.76 8.258e-26 634-678
BL00129A 26.21 1.720e-25 384-430
BL00129E 22.60 4.857e-23 698-734
BL00129C 15.12 1.750e-22 596-624
BL00129B 19.19 5.891e-18 495-522
BL00129F 26.19 7.545e-15 814-852


897 


BL00598 


Chromo domain proteins. 


BL00598 14.45 1.220e-13 9-31 


898 


BL00518 


Zinc finger, C3HC4 type (RING finger), 
proteins. 


BL00518 12.23 6.000e-09 396-405 


899 


PD01101


INHIBITOR HEAVY CHAIN 
CHANNEL IN. 


PD01101B 21.53 1.000e-40 274-327
PD01101D 24.45 1.000e-40 457-512
PD01101A 18.25 6.268e-23 83-117
PD01101C 12.69 1.237e-16 366-386
PD01101E 6.73 7.750e-12 566-576


900 


PR00600 


PROTEIN PHOSPHATASE PP2A 55KD 
REGULATORY SUBUNIT 
SIGNATURE 


PR00600A 11.61 5.979e-09 31-52 


901 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 8.116e-31 24-63 


903 


BL01115 


GTP-binding nuclear protein ran proteins. 


BL01115A 10.22 1.509e-11 21-65


906 


DM00215 


PROLINE-RICH PROTEIN 3. 


DM00215 19.43 2.174e-13 539-572
DM00215 19.43 4.750e-12 549-582
DM00215 19.43 9.824e-11 551-584
DM00215 19.43 2.929e-10 548-581
DM00215 19.43 4.054e-10 550-583
DM00215 19.43 5.339e-10 552-585
DM00215 19.43 7.107e-10 544-577


907 


PR00988 


URIDINE KINASE SIGNATURE 


PR00988A 6.39 6.276e-12 314-332


908 


BL00107 


Protein kinases ATP-binding region 
proteins. 


BL00107A 18.39 5.950e-17 1125-1156


909 


BL00107 


Protein kinases ATP-binding region 
proteins. 


BL00107A 18.39 5.950e-17 1118- 
1149 


910 


BL00107 


Protein kinases ATP-binding region 
proteins. 


BL00107A 18.39 8.560e-13 150- 
181 


911 


BL00107 


Protein kinases ATP-binding region 
proteins. 


BL00107A 18.39 8.560e-13 150- 
181 


912 


PF00856 


SET domain proteins. 


PF00856A 26.14 4.553e-ll 243- 
280 


913 


PF00628 


PHD-finger. 


PF00628 15.84 6.400e-13 197-212 


914 


PR00962 


LETHAL(2) GIANT LARVAE 
PROTEIN SIGNATURE 


PR00962D 10.40 1.000e-27 435-459
PR00962G 15.71 4.086e-26 593-618
PR00962B 11.98 9.122e-26 296-319
PR00962A 13.28 6.143e-22 15-34
PR00962C 8.00 4.000e-21 348-369
PR00962F 12.39 9.769e-21 552-572
PR00962H 13.32 2.636e-20 623-643
PR00962I 11.68 9.786e-20 692-712
PR00962E 8.81 2.915e-18 515-534


915 


PR00962 


LETHAL(2) GIANT LARVAE
PROTEIN SIGNATURE 


PR00962D 10.40 1.000e-27 365-389
PR00962G 15.71 4.086e-26 523-548
PR00962A 13.28 6.143e-22 15-34
PR00962C 8.00 4.000e-21 278-299
PR00962F 12.39 9.769e-21 482-502
PR00962H 13.32 2.636e-20 553-573
PR00962I 11.68 9.786e-20 622-642
PR00962E 8.81 2.915e-18 445-464


916 


BL00134 


Serine proteases, trypsin family, histidine 
proteins. 


BL00134A 11.96 5.886e-14 90-107


917 


BL00478 


LIM domain proteins. 


BL00478B 14.79 8.393e-13 211-226
BL00478B 14.79 6.712e-10 271-286


918 


PR00049 


WILM'S TUMOUR PROTEIN 
SIGNATURE 


PR00049D 0.00 5.729e-09 973- 
988 


922 


BL00150 


Acylphosphatase proteins. 


BL00150 25.33 l.OOOe-40 37-84 


924 


DM00031 


IMMUNOGLOBULIN V REGION. 


DM00031B 15.41 8.063e-09 79-113


925 


BL00072 


Acyl-CoA dehydrogenases proteins. 


BL00072D 30.08 2.837e-24 280-331
BL00072E 24.12 8.200e-24 368-411
BL00072C 25.30 7.873e-20 226-267
BL00072B 9.48 6.049e-12 183-196


927 


BL00237 


G-protein coupled receptors proteins. 


BL00237C 13.19 1.692e-13 229- 
256 BL00237A 27.68 6.657e-13 
90-130 BL00237D 11.23 9.571e- 
13 290-307 


928 


BL01033 


Globins profile. 


BL01033A 16.94 7.923e-18 25-47
BL01033B 13.81 1.000e-15 93-105


929 


BL00216 


Sugar transport proteins. 


BL00216B 27.64 8.714e-13 203- 
253 


932 


BL00415 


Synapsins proteins. 


BL00415N 4.29 9.519e-10 353-397
BL00415N 4.29 2.117e-09 63-107
BL00415N 4.29 3.628e-09 57-101
BL00415N 4.29 5.664e-09 347-391


933 


PD02448 


TRANSCRIPTION PROTEIN DNA- 
BINDIN. 


PD02448A 9.37 1.000e-40 46-85
PD02448B 10.17 1.000e-40 85-133
PD02448C 13.62 1.000e-40 152-189
PD02448E 11.33 9.000e-30 223-249
PD02448F 14.22 9.654e-25 267-291
PD02448D 11.48 3.659e-18 197-211
PD02448G 10.73 7.857e-16 293-306


934 


DM00191 


w SPAC8A4.04C RESISTANCE 
SPAC8A4.05C DAUNORUBICIN. 


DM00191D 13.94 9.083e-10 136-175


935 


BL01115 


GTP-binding nuclear protein ran proteins. 


BL01115A 10.22 4.696e-10 67- 
111 


936 


BL00019 


Actinin-type actin-binding domain 
proteins. 


BL00019D 15.33 8.138e-14 865-895


937 


PR00762 


CHLORIDE CHANNEL SIGNATURE 


PR00762A 14.22 4.000e-22 183-201
PR00762C 9.29 1.000e-21 268-288
PR00762E 12.07 3.250e-20 520-537
PR00762D 11.29 1.000e-19 470-491
PR00762F 15.12 1.429e-19 538-558
PR00762B 12.12 1.818e-18 214-234
PR00762G 14.13 3.455e-17 577-592


938 


BL00027 


'Homeobox' domain proteins. 


BL00027 26.43 9.500e-25 291-334 


939 


DM01111 


4 kw PHOSPHATASE
TRANSFORMING 61K PDF1.


DM01111E 17.28 1.568e-10 248-297
DM01111E 17.28 5.168e-10 659-708
DM01111D 16.76 5.263e-09 279-325
DM01111M 10.67 8.674e-09 911-935


940 


BL00107 


Protein kinases ATP-binding region 
proteins. 


BL00107B 13.31 l.OOOe-14 293- 
309 BL00107A 18.39 6.760e-13 
229-260 


942 


BL01160 


Kinesin light chain repeat proteins. 


BL01160B 19.54 9.832e-11 543-597


943 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 3.500e-35 8-47 


945 


BL00989 


Clathrin adaptor complexes small chain 
proteins. 


BL00989B 26.51 1.000e-40 66-117
BL00989A 11.66 1.000e-13 5-19


946 


PR00178 


FATTY ACID-BINDING PROTEIN 
SIGNATURE 


PR00178D 13.52 9.571e-09 450-469


947 


BL00178 


Aminoacyl-transfer RNA synthetases 
class-I proteins. 


BL00178B 7.11 4.857e-09 713-724


948 


PF00628 


PHD-finger. 


PF00628 15.84 8.412e-14 201-216 


951 


BL00216 


Sugar transport proteins. 


BL00216B 27.64 2.050e-10 180- 
230 


952 


PR00926 


MITOCHONDRIAL CARRIER 
PROTEIN SIGNATURE 


PR00926F 17.75 4.300e-11 26-49
PR00926F 17.75 6.348e-09 134-157


955 


PF00109 


Beta-ketoacyl synthase. 


PF00109 13.08 2.846e-12 342-357 


957 


PR00069 


ALDO-KETO REDUCTASE 
SIGNATURE 


PR00069A 16.01 8.826e-24 26-51
PR00069B 11.33 1.514e-17 86-105
PR00069C 16.03 8.816e-14 155-173


958 


PF00583 


Acetyltransferase (GNAT) family. 


PF00583A 12.53 5.500e-10 631- 
642 


961 


PR00328 


GTP-BINDING SAR1 PROTEIN
SIGNATURE 


PR00328A 10.62 8.740e-10 7-31 


962 


BL00354 


HMG-I and HMG-Y DNA-binding
domain proteins (A+T-hook). 


BL00354A 3.83 9.438e-10 1489- 
1499 


963 


BL00354 


HMG-I and HMG-Y DNA-binding 
domain proteins (A+T-hook). 


BL00354A 3.83 9.438e-10 1489- 
1499 


964 


BL00027 


'Homeobox' domain proteins. 


BL00027 26.43 7.188e-27 53-96


965 


PF00992 


Troponin. 


PF00992A 16.67 2.421e-09 581-616


966 


PR00515


5-HYDROXYTRYPTAMINE 1F
RECEPTOR SIGNATURE 


PR00515D 7.91 5.741e-09 13-33


967 


BL00579 


Ribosomal protein L29 proteins. 


BL00579B 21.99 5.065e-21 164- 
194 


970 


BL00504 


Fumarate reductase / succinate 
dehydrogenase FAD-binding site 
proteins. 


BL00504C 18.68 2.227e-24 34-59 
BL00504D 10.43 7.261e-21 75-93


973 


PF00580 


UvrD/REP helicase.


PF00580A 13.37 4.720e-09 249- 
271 


974 


PR00456 


RIBOSOMAL PROTEIN P2
SIGNATURE


PR00456F 5.86 1.000e-10 242-254


975 


BL00237 


G-protein coupled receptors proteins. 


BL00237A 27.68 4.429e-22 99- 
139 


976 


BL00031 


Nuclear hormones receptors DNA- 
binding region proteins. 


BL00031A 19.55 7.158e-33 60-93 
BL00031B 22.25 5.500e-28 94- 
126 


977 


PD00066 


PROTEIN ZINC-FINGER METAL- 
BINDI.


PD00066 13.92 8.200e-16 196-209
PD00066 13.92 8.200e-16 336-349
PD00066 13.92 2.385e-15 476-489
PD00066 13.92 9.308e-15 252-265
PD00066 13.92 2.800e-14 448-461
PD00066 13.92 4.600e-14 392-405
PD00066 13.92 5.200e-14 280-293
PD00066 13.92 4.000e-13 224-237
PD00066 13.92 4.429e-12 308-321
PD00066 13.92 9.571e-12 420-433
PD00066 13.92 6.870e-11 168-181


978 


BL00721 


Formate--tetrahydrofolate ligase proteins.


BL00721B 13.21 1.000e-40 346-401
BL00721D 13.90 1.000e-40 538-592
BL00721E 13.46 1.000e-40 597-646
BL00721I 18.79 2.500e-40 814-860
BL00721H 21.20 8.239e-39 763-814
BL00721A 15.31 9.719e-32 287-321
BL00721C 16.92 4.000e-30 498-535
BL00721F 15.96 8.232e-27 660-702
BL00721G 7.97 3.017e-10 721-734


981 


PD00126 


PROTEIN REPEAT DOMAIN TPR 
NUCLEA. 


PD00126A 22.53 2.552e-09 180- 
201 


982 


BL00869 


Renal dipeptidase proteins. 


BL00869C 12.58 3.172e-19 59-95
BL00869E 13.12 9.129e-18 120-157
BL00869J 15.60 6.032e-17 270-310
BL00869H 11.08 1.840e-16 219-242
BL00869G 13.55 2.543e-16 192-214
BL00869F 12.77 7.031e-14 157-192
BL00869I 12.92 3.274e-12 242-270
BL00869D 14.02 5.282e-10 95-124
BL00869B 15.55 9.382e-10 31-61


983 


PR00196 


ANNEXIN FAMILY SIGNATURE 


PR00196F 13.89 2.125e-09 92-108 


984 


BL00485 


Adenosine and AMP deaminase proteins. 


BL00485D 30.82 2.427e-10 154- 
209 



* Results include in order: accession number subtype; raw score; p-value; position of signature in amino acid 
sequence 
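
As an illustration only, the four-field record layout stated in the footnote (accession number subtype, raw score, p-value, position range) could be read mechanically as sketched below. The names SignatureHit, HIT_RE and parse_results are hypothetical and are not part of the disclosure; the pattern assumes cleaned text in which each hit is separated by whitespace.

```python
import re
from typing import List, NamedTuple


class SignatureHit(NamedTuple):
    """One hit from a RESULTS cell (hypothetical helper type)."""
    subtype: str      # accession number plus optional block letter, e.g. BL00301B
    raw_score: float  # raw score
    p_value: float    # p-value
    start: int        # first residue of the signature
    end: int          # last residue of the signature


# Assumed layout: <accession+subtype> <score> <p-value> <start>-<end>
HIT_RE = re.compile(
    r"(?P<subtype>[A-Z]{2}\d{5}[A-Z]?)\s+"
    r"(?P<score>\d+(?:\.\d+)?)\s+"
    r"(?P<p>\d+(?:\.\d+)?e-\d+)\s+"
    r"(?P<start>\d+)-(?P<end>\d+)"
)


def parse_results(cell: str) -> List[SignatureHit]:
    """Split one RESULTS cell into its individual hits, in order of appearance."""
    return [
        SignatureHit(
            m["subtype"],
            float(m["score"]),
            float(m["p"]),
            int(m["start"]),
            int(m["end"]),
        )
        for m in HIT_RE.finditer(cell)
    ]


# Example: the RESULTS cell for SEQ ID NO: 887 (accession BL00301) in the table above.
hits = parse_results(
    "BL00301B 20.09 2.800e-24 103-135 BL00301A 12.41 4.316e-13 21-33"
)
```

Each returned tuple corresponds to one signature block matched in the amino acid sequence; the regular expression is an assumption about the cleaned layout and would need loosening for rows that still carry scanning errors.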



TABLE 4 



SEQ ID 

NO: 


PFAM NAME 


DESCRIPTION 


p-value 


PFAM 
SCORE 


2 


ig 


Immunoglobulin domain 


3.9e-17 


60.3 


3 


HSP90 


Hsp90 protein 


0 


1548.4 


6 


tsp_l 


Thrombospondin type 1 domain 


0.002 


22.1 


7 


7tm_1


7 transmembrane receptor (rhodopsin 
family) 


6.7e-08 


27.3 


9 


PWWP 


PWWP domain 


8.1e-16 


66.0 


12 


C1q


C1q domain


1.7e-26 


101.5 


13 


C1q


C1q domain


2e-20 


81.3 


14 


Aa_trans


Transmembrane amino acid 
transporter protein 


2.7e-42 


153.9 


15 


E1-E2_ATPase


E1-E2 ATPase 


6.3e-124 


412.2 


16 


trypsin 


Trypsin 


1.2e-87 


278.6 


17 


ig 


Immunoglobulin domain 


7.6e-12 


43.2 


18 


lectin_c 


Lectin C-type domain 


0.0003 


21.2 


20 


Alpha_L_fucos 


Alpha-L-fucosidase 


1.2e-217 


736.5


22


pkinase 


Eukaryotic protein kinase domain 


3.3e-87 


303.1 


23 


pkinase 


Eukaryotic protein kinase domain 


2.7e-85 


296.8 


24 


pkinase 


Eukaryotic protein kinase domain 


2.7e-85 


296.8 


25 


ank 


Ank repeat 


5.5e-14 


59.9 


27 


pkinase 


Eukaryotic protein kinase domain 


1.5e-100 


347.4 


28 


spectrin 


Spectrin repeat 


4e-57 


203.2 


29 


spectrin 


Spectrin repeat 


4e-57 


203.2 


30 


WD40 


WD domain, G-beta repeat 


1.2e-07 


38.8 


33 


rrm


RNA recognition motif. 


1.1e-17


72.2 


34 


rrm


RNA recognition motif. 


1.1e-17


72.2 


36 


7tm_l 


7 transmembrane receptor (rhodopsin
family)


3e-36 


117.3 


37 


ank 


Ank repeat 


5.9e-25 


96.3 


38 


SRF-TF 


SRF-type transcription factor 


1.4e-36 


133.9 


40 


alk_phosphatase


Alkaline phosphatase 


0 


1034.9 


44 


zf-C2H2 


Zinc finger, C2H2 type


8.6e-103 


354.9 


45 


sugar_tr


Sugar (and other) transporter 


3.1e-08 


40.3 


47 


7tm_2


7 transmembrane receptor (Secretin 
family) 


6.4e-79 


275.6 


50 


zf-C2H2 


Zinc finger, C2H2 type


1.3e-98 


341.0 


51 


filament 


Intermediate filament proteins 


1.2e-176 


600.3 


52 


zf-C3HC4 


Zinc finger, C3HC4 type (RING
finger) 


2.7e-10 


37.7 


53 


Cadherin_C_term


Cadherin cytoplasmic region 


1.9e-94


327.2 


54 


S_100 


S-100/ICaBP type calcium binding
domain 


5.2e-18 


73.3 


58 


inositol_P


Inositol monophosphatase family 


5e-13 


49.8 


59 


7tm_l 


7 transmembrane receptor (rhodopsin 
family) 


8.8e-46 


147.6 


60 


Kunitz_BPTI


Kunitz/Bovine pancreatic trypsin 
inhibito 


3.7e-47 


148.6 


62 


DAD 


DAD family 


2.5e-74 


260.3 


63 


MOZ_SAS


MOZ/SAS family 


5.9e-133 


455.1 


64 


MOZ_SAS


MOZ/SAS family 


1.7e-123


423.6 


65 


ras 


Ras family 


9.3e-89 


308.3 


67 


Ham1p_like


Ham1 family


3.7e-49


176.7 


68 


7tm_1


7 transmembrane receptor (rhodopsin 
family) 


5.2e-39 


126.1 


70 


zf-C2H2 


Zinc finger, C2H2 type


1.5e-112 


387.3 


71 


Peptidase_M41 


Peptidase family M41


1.2e-110 


381.0 


72 


abhydrolase 


alpha/beta hydrolase fold 


9.8e-05 


26.5 


81 


K_tetra


K+ channel tetramerisation domain 


0.022 


-16.8 


82 


pkinase 


Eukaryotic protein kinase domain 


5e-49 


176.3 


84 


AAA 


ATPases associated with various 
cellular act 


1.3e-77 


271.3 


85 


homeobox 


Homeobox domain 


1.4e-28 


108.3 


87 


TGF-beta 


Transforming growth factor beta like 


6.7e-68


210.2 


91 


mito_carr


Mitochondrial carrier proteins 


4.6e-57 


198.5 


95 


adenylatekinase 


Adenylate kinase 


1.1e-15


60.0 


96 


ig 


Immunoglobulin domain 


4.1e-20 


69.8 


99 


CNH 


CNH domain 


3.4e-120


412.7 


100 


homeobox 


Homeobox domain 


7.4e-32 


119.3 


101 


zf-C2H2 


Zinc finger, C2H2 type


2.2e-47 


170.8 


102 


zf-C2H2 


Zinc finger, C2H2 type


4.4e-89 


309.4 


103 


dynamin 


Dynamin family 


1.4e-150 


513.6 


104 


lectin_c 


Lectin C-type domain 


4.2e-15 


63.6 


105 


lectin_c


Lectin C-type domain 


4.2e-15 


63.6 


108 


metalthio 


Metallothionein 


2e-25 


97.9


112


HSP20 


Hsp20/alpha crystallin family 


2.6e-20 


77.7 


115 


EF_TS


Elongation factor TS 


3.8e-63 


221.1 


116 


sugar_tr


Sugar (and other) transporter 


4e-63 


223.1 


118


catalase 


Catalase 


0 


1158.9 


119 


UCH 


Ubiquitin carboxyl-terminal
hydrolase, famil 


1e-10


24.4 


122 


metalthio 


Metallothionein 


2.8e-25 


97.4 


125 


adh_short


short chain dehydrogenase 


1.6e-45 


164.6 


126 


KRAB 


KRAB box 


7.9e-25 


95.9 


127 


G-alpha 


G-protein alpha subunit 


1e-249


843.0 


128 


mito_carr


Mitochondrial carrier proteins 


2e-65 


227.2 


131 


EF1BD


EF-1 guanine nucleotide exchange 
domain


4.9e-53 


189.6 


132 


GYF 


GYF domain 


4.9e-28 


106.6 


133 


GYF 


GYF domain 


4.9e-28 


106.6 


134 


lipocalin 


Lipocalin / cytosolic fatty-acid 
binding pr 


2.1e-33 


119.1 


135 


pkinase 


Eukaryotic protein kinase domain 


3.3e-86 


299.8 


136 


ank 


Ank repeat 


2.2e-29 


111.1 


137 


IL8 


Small cytokines 
(intecrine/chemokine), inter 


3.1e-18


65.2 


139 


pyridoxal_deC 


Pyridoxal-dependent decarboxylase 
conse 


0.00011 


19.0 


140 


cadherin 


Cadherin domain 


1.3e-88 


307.8 


142 


efhand 


EF hand 


5.7e-33 


123.0 


143 


Acyltransferase 


Acyltransferase 


2e-29 


111.2 


146 


cytochrome_c


Cytochrome c 


1.7e-33 


124.7 


147 


pkinase 


Eukaryotic protein kinase domain 


2.3e-86 


300.3 


148 


PDZ 


PDZ domain (Also known as DHR or 
GLGF). 


1.7e-09


45.0 


149 


aldo_ket_red


Aldo/keto reductase family 


7.4e-189


640.8 


150 


homeobox 


Homeobox domain


3.2e-08 


38.7 


151 


PseudoU_synth_1


tRNA pseudouridine synthase 


4.7e-57 


203.0 


152 


abhydrolase 


alpha/beta hydrolase fold 


1.7e-31 


118.0 


153 


PDZ 


PDZ domain (Also known as DHR or
GLGF).


1.1e-09


45.6 


156 


PHD 


PHD-finger


7.6e-15 


62.8 


157 


fn3


Fibronectin type III domain 


0.015 


21.9 


158 


homeobox 


Homeobox domain 


2.7e-27 


104.1 


160 


PWI 


PWI domain 


3.9e-24 


93.6 


162 


DnaJ 


DnaJ domain 


2e-06 


34.8 


164 


Cbl_N 


CBL proto-oncogene N-terminal 
domain 


8e-117 


401.5 


166 


metalthio 


Metallothionein 


3.1e-26 


100.6 


167 


LRR 


Leucine Rich Repeat 


0.00069 


26.3 


169 


fibrinogen_C 


Fibrinogen beta and gamma chains, 
C-term


5.3e-180


611.4 


170 


fibrinogen_C


Fibrinogen beta and gamma chains, 
C-term 


5.3e-180 


611.4 


171 


fibrinogen_C


Fibrinogen beta and gamma chains, 
C-term 


1e-149


510.8 


173 


homeobox 


Homeobox domain 


1.5e-29 


111.6 


174 


FYVE 


FYVE zinc finger 


7.4e-28 


103.8 


175 


GRIP 


GRIP domain 


3.9e-08 


40.5 


182 


pkinase 


Eukaryotic protein kinase domain 


3.4e-71 


250.0 


185 


CAP_GLY


CAP-Gly domain 


5.6e-51 


182.8 


186 


TBC 


TBC domain 


2.2e-50 


180.8 


187 


TBC


TBC domain 


2.2e-50 


180.8


188


PDZ 


PDZ domain (Also known as DHR or 
GLGF). 


4e-13 


57.0 


189 


Kelch 


Kelch motif 


5.2e-106 


365.6 


190 


Tropomyosin 


Tropomyosins 


3.8e-171 


535.4 


192 


Rieske 


Rieske [2Fe-2S] domain 


0.0016 


18.5 


199 


ig 


Immunoglobulin domain 


5.9e-19 


66.1 


202 


EGF 


EGF-like domain 


3.4e-54 


193.5 


203


trefoil 


Trefoil (P-type) domain 


1e-24


95.5 


204 


TBC 


TBC domain 


8.5e-38 


139.0 


205 


efhand 


EF hand 


0.0096 


22.6 


206 


ISK_Channel


Slow voltage-gated potassium 
channel 


0.0031 


8.1 


207 


trefoil 


Trefoil (P-type) domain


2.9e-48 


173.7 


209 


Ribosomal_S13 


Ribosomal protein S13/S18 


1.2e-78 


274.7 


210 


hemopexin 


Hemopexin


1.3e-62 


221.5 


213 


TBC


TBC domain 


2.5e-48 


174.0 


215 


Basic 


Myogenic Basic domain 


4.3e-50


179.8 


216 


Ribosomal_L24


KOW motif 


8.2e-23


89.2 


222 


fn3


Fibronectin type III domain 


7.3e-141 


481.4 


223 


cofilin_ADF 


Cofilin/tropomyosin-type actin-
binding pr 


9.3e-47 


168.8 


224 


efhand 


EF hand 


6.1e-06 


33.2 


225 


Pterin_4a


Pterin 4 alpha carbinolamine
dehydratase 


9.3e-42 


152.1 


228 


ABC_tran


ABC transporter 


4.1e-110 


379.2 


234 


E1_DerP2_DerF2


E1 family


3.7e-90 


312.9 


235 


E1_DerP2_DerF2


E1 family


1.6e-48 


174.6 


237 


PMP22_Claudin 


PMP-22/EMP/MP20/Claudin family 


1.7e-25 


98.1 


238 


Opiods_neuropep


Vertebrate endogenous opioids
neurope


1.8e-159 


543.2 


239 


eIF-5a 


Eukaryotic initiation factor 5A
hypusine


5.9e-104 


358.8 


240 


Amino_oxidase


Flavin containing amine oxidase 


2.5e-11


37.8 


243 


zf-C2H2 


Zinc finger, C2H2 type 


2.1e-99 


343.6 


244 


Band_7 


SPFH domain / Band 7 family 


2.3e-53 


190.7 


245 


ank 


Ank repeat 


1.6e-88


307.5 


246 


zf-C2H2


Zinc finger, C2H2 type 


6.7e-49 


175.9 


247 


actin 


Actin 


2.3e-42 


140.3 


248 


ER_lumen_recept


ER lumen protein retaining receptor


2.4e-155 


529.5 


250 


PMP22_Claudin 


PMP-22/EMP/MP20/Claudin family 


2.2e-38 


140.9 


252 


Collagen 


Collagen triple helix repeat (20
copies) 


1.4e-13 


58.6 


255 


C2 


C2 domain 


0.052 


7.8 


257 


CAP_GLY


CAP-Gly domain 


1.4e-20 


81.8 


260 


WD40 


WD domain, G-beta repeat 


9.9e-62 


218.5 


261 


WD40 


WD domain, G-beta repeat 


9.9e-62 


218.5 


262 


WD40 


WD domain, G-beta repeat 


9.9e-62 


218.5 


263 


cofilin_ADF


Cofilin/tropomyosin-type actin- 
binding pr 


7.8e-21 


82.6 


264 


Ribosomal_L14 


Ribosomal protein L14p/L23e 


9.2e-10 


40.6 


265 


SAPA 


Saposin A-type domain 


4.4e-27 


103.4 


266 


SAPA 


Saposin A-type domain


4.4e-27 


103.4 


267 


ABC_tran


ABC transporter 


9.5e-39 


142.2


269 


Ribosomal_L14 


Ribosomal protein L14p/L23e 


6.2e-62 


219.2 


270 


abhydrolase 


alpha/beta hydrolase fold 


0.042 


-3.3 


272 


ras 


Ras family 


4.3e-87 


302.8 
273


rrm 


RNA recognition motif. 


0.074 


14.6 


275 


lipocalin 


Lipocalin / cytosolic fatty-acid 
binding pr


2.5e-41 


146.4 


276 


ras 


Ras family 


1.1e-67


238.3 


277 


UCH 


Ubiquitin carboxyl-terminal
hydrolase, famil


1.2e-147 


503.9 


278 


START 


START domain 


3.2e-09 


44.1 


279 


WD40 


WD domain, G-beta repeat 


1.8e-27 


104.7 


282 


G-patch 


G-patch domain


7.8e-22 


86.0 


287 


Anti_proliferat


BTG1 family


1.2e-101 


351.0 


289 


KRAB 


KRAB box 


7.1e-21 


82.8 


293 


7tm_3 


7 transmembrane receptor 


3.3e-73 


256.6 


295 


SET 


SET domain 


5e-30 


113.2 


296 


Pyridox_oxidase 


Pyridoxamine 5'-phosphate oxidase 


1.3e-76 


268.0 


297 


rrm 


RNA recognition motif. 


5.4e-45 


162.9 


298 


Ubie_methyltran 


ubiE/COQ5 methyltransferase family


6.3e-05 


-96.3 


299 


Ubie_methyltran 


ubiE/COQ5 methyltransferase family


0.0024 


-118.1 


301 


Cytreductase 


FAD/NAD-binding Cytochrome 
reductase 


7.7e-61


215.5 


302 


G-patch 


G-patch domain 


3.1e-14 


60.7 


307 


7tm_1


7 transmembrane receptor (rhodopsin 
family) 


7.7e-43 


138.2 


308 


PH 


PH domain 


0.0015 


17.8 


310 


7tm_1


7 transmembrane receptor (rhodopsin 
family) 


1.4e-84


270.8 


311 


Rhodanese 


Rhodanese-like domain 


3.3e-64 


226.7 


312 


tubulin 


Tubulin/FtsZ family 


4.9e-286 


963.6 


314 


SURF4


SURF4 family 


1.2e-199 


676.6 


325 


IMS 


impB/mucB/samB family 


2e-58 


207.5 


327 


cadherin 


Cadherin domain 


4.3e-91 


316.0 


329 


NAC 


NAC domain 


2.1e-28 


107.8 


330 


IP_trans


Phosphatidylinositol transfer protein 


6.5e-98 


338.7 


332 


TFIIS 


Transcription factor S-II (TFIIS) 


8.8e-05 


29.3 


337 


zf-C2H2


Zinc finger, C2H2 type


3.6e-61 


216.6 


340 


AIRS 


AIR synthase related protein 


4e-32 


120.2 


343 


annexin 


Annexin 


4.6e-80 


279.4 


346 


Stathmin 


Stathmin family 


1.8e-90 


314.0 


347 


Ribosomal_L16


Ribosomal protein L16


4.6e-09


34.9 


348 


lactamase_B


Metallo-beta-lactamase superfamily 


0.012 


-6.0 


351 


efhand 


EF hand 


2.5e-14 


61.0 


353 


lectin_c


Lectin C-type domain 


1.3e-05 


32.1 


354 


WD40 


WD domain, G-beta repeat 


2.2e-18 


74.5 


360 


lipocalin 


Lipocalin / cytosolic fatty-acid 
binding pr 


6.3e-10 


38.3 


362 


Acetyltransf 


Acetyltransferase (GNAT) family 


0.0019 


24.9 


365 


tRNA-synt_1


tRNA synthetases class I (I, L, M and 
V) 


4.6e-185 


628.2 


366 


Sulfatase 


Sulfatase 


6.1e-228 


770.6 


368 


START 


START domain 


3.8e-11


50.5 


369 


pkinase 


Eukaryotic protein kinase domain 


2.4e-10 


41.3 


370 


ACBP 


Acyl CoA binding protein 


4.4e-56 


199.7 


371 


pkinase


Eukaryotic protein kinase domain 


1.6e-94 


327.5 


373 


EGF 


EGF-like domain 


2.6e-12 


54.3 


375 


zf-C2H2 


Zinc finger, C2H2 type 


8.2e-64 


225.4 


377 


KRAB 


KRAB box 


3.7e-27 


103.7 


379 


SET 


SET domain 


7.3e-61


215.6 


380 


Glyco_transf_8


Glycosyl transferase family 8 


0.0028 


-40.1 


381 


zf-C2H2


Zinc finger, C2H2 type


4.3e-06 


33.7 


383 


Glyco_transf_8 


Glycosyl transferase family 8 


0.0028 


-40.1 
384


RasGEF 


RasGEF domain 


8.1e-43 


155.7


385 


TBC 


TBC domain 


0.017 


-66.6 


389 


Glycos_transf_2 


Glycosyl transferases 


1.3e-15 


65.3 


390 


Na_Ca_Ex


Sodium/calcium exchanger protein 


3.9e-105 


362.7 


391 


fn3


Fibronectin type III domain 


4.1e-102 


352.6 


392 


fn3


Fibronectin type III domain 


3.4e-45 


163.6 


393 


fn3


Fibronectin type III domain


3.4e-45 


163.6 


394 


ldl_recept_b 


Low-density lipoprotein receptor 
repeat 


7.1e-49 


175.8 


395 


Ribosomal_L30


Ribosomal protein L30p/L7e 


0.0023 


16.0 


396 


Oxysterol_BP 


Oxysterol-binding protein 


1.5e-94 


327.5 


397


RDS_ROM1


Peripherin/rom-1


2.9e-33 


123.9 


399 


lactamase_B


Metallo-beta-lactamase superfamily 


3.4e-39 


143.6 


402 


F-box 


F-box domain. 


0.0002 


28.1 


403 


CLP_protease 


Clp protease 


4.8e-64 


226.2 


405 


Ribosomal_L35Ae


Ribosomal protein L35Ae 


6e-77 


269.0 


406


LIM 


LIM domain containing proteins 


0.00021 


20.7 


410 


tRNA-synt_1c


tRNA synthetases class I (E and Q) 


1e-236


799.8 


411 


NTP_transf_2


Nucleotidyltransferase domain 


3.9e-16 


67.0 


412 


DEAD 


DEAD/DEAH box helicase 


0.00016 


17.2 


414 


DUF94 


Domain of unknown function DUF94 


0.00011 


26.9 


415 


tubulin 


Tubulin/FtsZ family 


4.5e-289 


973.7 


420 


SET 


SET domain 


3.3e-57 


203.5 


421 


WD40 


WD domain, G-beta repeat 


6.1e-29 


109.6 


423 


zf-C2H2 


Zinc finger, C2H2 type 


1.5e-39 


144.9 


424 


pkinase 


Eukaryotic protein kinase domain 


8.9e-75 


261.8 


428 


LIM 


LIM domain containing proteins 


1.8e-34 


126.7 


431 


kazal 


Kazal-type serine protease inhibitor 
domain 


3.7e-18 


73.8 


432 


SH2 


Src homology domain 2 


1.4e-67 


198.4 


433 


zf-C2H2 


Zinc finger, C2H2 type


2.8e-144 


492.7 


434 


ras 


Ras family 


0.012 


-106.8


436 


E1-E2_ATPase


E1-E2 ATPase 


1.6e-117 


391.0 


437 


RNA_pol_A


RNA polymerase alpha subunit 


0 


1077.7 


438 


PHD 


PHD-finger


1.6e-11


51.7 


439 


lectin_c


Lectin C-type domain 


4.7e-30 


113.3 


440 


zf-C2H2 


Zinc finger, C2H2 type 


1.1e-65


231.6 


441 


arrestin 


Arrestin (or S-antigen) 


2.9e-254 


858.1 


442 


aminotran_3 


Aminotransferases class-III
pyridoxal-pho 


8.2e-80 


231.1 


443 


UCH-1 


Ubiquitin carboxyl-terminal 
hydrolases famil 


8.5e-12 


52.6 


444 


CTF_NFI


CTF/NF-I family 


2.6e-277 


934.6 


451 


T-box 


T-box 


3.8e-117 


402.6 


453 


Rieske


Rieske [2Fe-2S] domain 


2.6e-13


57.7 


454 


zf-C2H2 


Zinc finger, C2H2 type


3.9e-64 


226.5 


456 


homeobox 


Homeobox domain 


2.8e-08 


38.9 


459 


ig 


Immunoglobulin domain 


2.6e-20 


70.5 


460 


Hydrolase 


haloacid dehalogenase-like hydrolase


4e-25 


96.9 


462 


rve 


Integrase core domain 


1.6e-13 


50.7 


466 


CH 


Calponin homology (CH) domain


2.4e-17 


71.1 


467 


CH 


Calponin homology (CH) domain 


2.4e-17 


71.1 


468 


Sterol_desat


Sterol desaturase 


7.5e-38


139.2 


469 


pro_isomerase 


Cyclophilin type peptidyl-prolyl cis- 
tr 


2.6e-63 


220.9 


470 


Peptidase_M24


metallopeptidase family M24 


6e-08 


28.1 


471 


PDZ 


PDZ domain (Also known as DHR or 
GLGF). 


5.4e-129 


441.9 
472


myb_DNA-binding


Myb-like DNA-binding domain 


3.6e-06 


33.9 


473 


ZZ 


Zinc finger present in dystrophin, CB 


0.012 


20.0 


474 


EF1G_domain


Elongation factor 1 gamma, 
conserved doma 


6.3e-88 


305.5 


475 


Ribosomal_L31e


Ribosomal protein L31e 


6.1e-66 


232.5 


476 


C1q


C1q domain


2.5e-75 


263.7 


477 


SH3 


SH3 domain 


1.1e-12


55.6 


478 


MoaA_NifB_PqqE


moaA / nifB / pqqE family


0.002 


-17.7 


479 


FYVE 


FYVE zinc finger 


9.3e-21 


78.6 


480 


DNA_pol_A


DNA polymerase family A 


2.3e-46 


167.4 


482 


adh_short


short chain dehydrogenase


1.2e-62 


221.6 


483 


ank 


Ank repeat 


1.3e-17 


71.9 


484 


IMS 


impB/mucB/samB family 


2.2e-83 


290.5 


486 


TIR 


TIR domain 


3.2e-19 


67.8 


487 


FMO-like 


Flavin-binding monooxygenase-like 


0 


1425.5 


488 


I_LWEQ 


I/LWEQ domain


9.5e-101 


341.0 


495 


homeobox 


Homeobox domain 


3.6e-06 


30.8 


497 


pkinase 


Eukaryotic protein kinase domain 


2.3e-166


566.1 


499 


fn3


Fibronectin type III domain 


2.5e-237


801.8 


501 


LRR 


Leucine Rich Repeat 


9.3e-31 


115.6 


502 


RGS 


Regulator of G protein signaling 
domain 


0.041 


11.9 


503 


filament 


Intermediate filament proteins 


1e-142


487.5 


505 


fn3


Fibronectin type III domain 


1.3e-100


347.7 


506 


HECT 


HECT-domain (ubiquitin-
transferase).


1e-13


59.0 


507 


Ribosomal_L7Ae


Ribosomal protein L7Ae 


5.7e-26 


99.7 


508 


WD40 


WD domain, G-beta repeat 


0.063 


19.8


509 


WD40 


WD domain, G-beta repeat 


0.063 


19.8 


510 


WD40 


WD domain, G-beta repeat 


2.1e-42 


154.3 


511 


pkinase 


Eukaryotic protein kinase domain 


2.3e-86 


300.4 


512 


G-gamma 


GGL domain 


1.9e-08 


34.3 


513 


SH3 


SH3 domain 


3e-06 


34.2 


515 


HTH_AraC 


Bacterial regulatory helix-turn-helix
protei 


3.9e-27 


103.6


516 


zf-C2H2 


Zinc finger, C2H2 type


1.7e-34 


128.0 


517 


S1


S1 RNA binding domain


6.1e-58 


205.9 


518 


pkinase 


Eukaryotic protein kinase domain 


1.8e-75 


264.2 


525 


cadherin 


Cadherin domain 


2e-80 


280.6 


528 


zf-C2H2 


Zinc finger, C2H2 type


4e-70 


246.4 


529 


neur_chan


Neurotransmitter-gated ion-channel 


5.8e-222 


750.8 


531 


RhoGEF 


RhoGEF domain 


3.5e-44 


160.2 


532 


myosin_head 


Myosin head (motor domain) 


0 


1494.5 


533 


LRR 


Leucine Rich Repeat 


8.3e-15 


62.6 


535 


Sec7 


Sec7 domain 


5.1e-92 


319.1 


536 


homeobox 


Homeobox domain 


4.8e-05 


26.4 


539 


actin 


Actin 


2.4e-100 


330.6 


542 


ank 


Ank repeat 


1.9e-35


131.2 


544 


zf-CCCH 


Zinc finger C-x8-C-x5-C-x3-H type


2.8e-10 


41.7 


546 


DSPc 


Dual specificity phosphatase, 
catalytic doma 


2.4e-40 


147.4 


547 


HMG_CoA_synt 


Hydroxymethylglutaryl-coenzyme A
synthas


0 


1250.8 


549 


laminin_G


Laminin G domain 


3.3e-76 


266.6 


551 


PHD 


PHD-finger 


0.008 


9.3


552 


PDZ 


PDZ domain (Also known as DHR or
GLGF).


0.0017


25.0


555 


WW 


WW domain 


1.3e-24 


95.3 


558 


kinesin 


Kinesin motor domain 


1.8e-176 


599.7 


559 


zf-C3HC4 


Zinc finger, C3HC4 type (RING 
finger)


0.00085 


16.5 


563 


efhand 


EF hand 


7.9e-11


49.4 


567 


PH 


PH domain 


7.8e-06 


25.9 


568 


PH 


PH domain 


3.1e-39 


143.8 


569 


Hist_deacetyl


Histone deacetylase family 


5.2e-106


365.6 


570 


PDZ 


PDZ domain (Also known as DHR or
GLGF).


3.4e-20 


80.5 


571 


zf-C3HC4 


Zinc finger, C3HC4 type (RING 
finger) 


1e-16


58.5 


573 


ubiquitin 


Ubiquitin family 


1.4e-08 


31.1 


574 


FH2 


Formin Homology 2 Domain 


1.3e-110 


380.9 


576 


serpin 


Serpins (serine protease inhibitors) 


4.3e-146 


496.4 


579 


zf-C2H2 


Zinc finger, C2H2 type 


5.7e-76 


265.8 


580 


pkinase 


Eukaryotic protein kinase domain 


6.9e-79 


275.5 


581 


RhoGAP 


RhoGAP domain 


4.4e-53 


189.8 


582 


Ribosomal_L7Ae


Ribosomal protein L7Ae 


0.028 


1.0 


584 


kazal 


Kazal-type serine protease inhibitor 
domain 


2.2e-52 


187.4 


585 


LRR 


Leucine Rich Repeat 


4.4e-28 


106.7 


586 


PHD 


PHD-finger 


3.8e-12 


53.8 


588 


GTP1_OBG


GTP1/OBG family


1.1e-62


215.2 


590 


Collagen 


Collagen triple helix repeat (20 
copies) 


8e-42 


152.4 


591 


lys 


C-type lysozyme/alpha-lactalbumin 
family 


1.6e-31 


116.4 


596 


ACBP 


Acyl CoA binding protein 


0.0022 


-9.4 


597 


SNF2_N


SNF2 and others N-terminal domain 


3.7e-98 


339.5 


600 


KRAB 


KRAB box 


1.3e-29 


111.8 


606 


LRR 


Leucine Rich Repeat 


1e-05


32.5 


607 


LRR 


Leucine Rich Repeat 


1e-05


32.5 


608 


WD40 


WD domain, G-beta repeat 


5.3e-23 


89.8 


610 


cpn60_TCP1


TCP-1/cpn60 chaperonin family


1.7e-237


802.4 


613 


THF_DHG_CYH


Tetrahydrofolate
dehydrogenase/cyclohydro


4.9e-173 


588.3 


617 


rrm 


RNA recognition motif. 


4e-14


60.4 


618 


rrm 


RNA recognition motif. 


4e-14 


60.4 


620 


cofilin_ADF 


Cofilin/tropomyosin-type actin- 
binding pr 


3e-06 


34.2 


621 


Nop 


Putative snoRNA binding domain 


6.1e-95 


328.8 


622 


UCH-2 


Ubiquitin carboxyl-terminal 
hydrolase family 


5.8e-21


83.1 


625 


zf-C2H2 


Zinc finger, C2H2 type 


2.5e-124 


426.4 


628 


DEAD 


DEAD/DEAH box helicase 


2.5e-68 


219.0 


632 


GST 


Glutathione S-transferases. 


4.8e-26 


89.0 


633 


5_nucleotidase 


5'-nucleotidase


6.6e-248 


837.0 


636 


LIM 


LIM domain containing proteins 


1.6e-88 


307.5 


637 


pkinase 


Eukaryotic protein kinase domain 


1.5e-73


257.8 


638 


MSP_domain 


MSP (Major sperm protein) domain 


8.4e-09 


42.7 


639 


metalthio 


Metallothionein 


2e-24 


94.6 


641 


zf-C2H2 


Zinc finger, C2H2 type 


6.1e-114 


391.9 


642 


Ribosomal_S28e


Ribosomal protein S28e 


9.3e-48 


172.1 


643 


Ribosomal_S5


Ribosomal protein S5 


8.3e-87 


301.8 


646 


PHD 


PHD-finger 


0.00025 


23.1 


647 


WD40 


WD domain, G-beta repeat


1.5e-22


88.4 
648


Lipase_GDSL


Lipase/Acylhydrolase with GDSL-
like motif


0.015


2.2 


652 


zf-C2H2 


Zinc finger, C2H2 type 


4.1e-146 


498.8 


653 


histone 


Core histone H2A/H2B/H3/H4 


1.2e-10 


48.8 


654 


zf-C2H2


Zinc finger, C2H2 type 


1.9e-87 


303.9 


655 


ras 


Ras family 


6.4e-77 


269.0 


657 


zf-C3HC4 


Zinc finger, C3HC4 type (RING
finger) 


5.3e-13 


46.4 


658 


STphosphatase 


Ser/Thr protein phosphatase 


2.6e-182 


619.1 


659 


zf-C2H2 


Zinc finger, C2H2 type 


1.3e-92


321.1 


660 


zf-C2H2 


Zinc finger, C2H2 type 


1.5e-85


297.6 


662 


NDK 


Nucleoside diphosphate kinases 


1.4e-119 


410.7 


664 


IRF 


Interferon regulatory factor 
transcription f 


7e-20 


79.5 


665 


4HPPD_C 


4-hydroxyphenylpyruvate
dioxygenase C term


1.4e-16 


68.5 


666 


DEAD 


DEAD/DEAH box helicase 


4.8e-74 


237.1


667 


DEAD 


DEAD/DEAH box helicase 


2.9e-70 


225.1 


669 


pkinase 


Eukaryotic protein kinase domain 


6.1e-93 


322.2 


671 


homeobox 


Homeobox domain 


0.018 


16.5 


678 


crystall 


Beta/Gamma crystallin 


4.7e-106 


365.8 


679 


WD40 


WD domain, G-beta repeat 


1.9e-06 


34.9 


680 


Keratin_B2


Keratin, high sulfur B2 protein


4.1e-06 


15.9 


682 


G-gamma 


GGL domain 


8.5e-33 


117.9 


685 


UCH-2 


Ubiquitin carboxyl-terminal 
hydrolase family 


1.4e-29


111.7 


686 


Acetyltransf 


Acetyltransferase (GNAT) family 


6.6e-10 


46.4 


687 


7tm_1


7 transmembrane receptor (rhodopsin 
family) 


4.6e-15 


50.0 


688 


proteasome 


Proteasome A-type and B-type 


6.5e-64 


225.7 


689 


SCP2 


SCP-2 sterol transfer family 


6.2e-37 


136.1 


690 


TS-N 


TS-N domain 


0.041 


20.1 


692 


zf-C2H2 


Zinc finger, C2H2 type 


9.9e-60 


211.9 


693 


zf-MYND 


MYND finger 


0.038 


5.5 


694 


Oxysterol_BP


Oxysterol-binding protein 


3.9e-133 


455.7 


695 


PDZ 


PDZ domain (Also known as DHR or
GLGF).


1.3e-30 


115.1 


703 


Peptidase_C2 


Calpain family cysteine protease 


2.3e-175 


596.0 


706 


filament 


Intermediate filament proteins 


7.2e-107 


368.5 


710 


fibrinogen_C 


Fibrinogen beta and gamma chains, 
C-term 


7e-80 


278.0 


711 


SH2 


Src homology domain 2 


2.3e-65


192.1 


712 


ATP-synt_DE


ATP synthase, Delta/Epsilon chain 


0.00062 


19.0 


713 


ARID 


ARID DNA binding domain


2e-17 


71.3 


714 


LBP_BPI_CETP


LBP / BPI / CETP family


8.6e-34


125.7 


715 


RNA_pol_L


RNA polymerases L / 13 to 16 kDa 
subunit 


4.8e-49 


176.3 


716 


KRAB 


KRAB box 


1.3e-42


155.0 


717 


mito_carr


Mitochondrial carrier proteins 


4.8e-38 


133.3 


719 


Gal-bind_lectin


Vertebrate galactoside-binding lectin 


1.5e-25 


90.2 


726 


aldedh 


Aldehyde dehydrogenase family 


1.3e-119 


410.8 


728 


Glycos_transf_2 


Glycosyl transferases 


4e-21 


83.6 


734 


ELM2 


ELM2 domain 


2e-34 


127.8 


735 


PR55 


Protein phosphatase 2A regulatory 
subunit PR 


0 


1038.2 


737 


DSPc 


Dual specificity phosphatase, 
catalytic doma 


4e-14


60.4 


740 


WD40 


WD domain, G-beta repeat 


5.6e-14 


59.9 


745 


zf-C3HC4 


Zinc finger, C3HC4 type (RING
finger)


3.8e-13


46.9


749 


mito_carr 


Mitochondrial carrier proteins 


4.5e-67 


232.8 


750 


DUF27 


Domain of unknown function DUF27 


4.5e-12 


53.5 


751 


SH3 


SH3 domain 


3.6e-17 


70.5 


752 


HMG_box


HMG (high mobility group) box 


8.6e-13 


55.9 


753 


SPRY 


SPRY domain 


5.9e-05 


23.3 


754 


GTP_CDC


Cell division protein 


7.5e-153 


521.2 


755 


mito_carr


Mitochondrial carrier proteins 


3e-88 


305.4 


756 


TSPN 


Thrombospondin N-terminal -like 
domains 


8.1e-58 


205.5 


757 


BTB 


BTB/POZ domain 


5.7e-23 


89.7 


759 


zf-C2H2


Zinc finger, C2H2 type 


1.2e-12 


55.4 


760 


NSF 


NSF attachment protein 


6.4e-127 


435.1 


762 


Ribosomal_S14 


Ribosomal protein S14p/S29e


2.1e-06 


24.8 


765 


ThiF_family


ThiF family


1.7e-39


144.6 


766 


DnaJ 


DnaJ domain 


3.9e-36 


133.5 


768 


tRNA-synt_2b


tRNA synthetase class II 


9.1e-81 


281.7 


769 


ldl_recept_a 


Low-density lipoprotein receptor 
domain 


0 


1404.5 


770 


WD40 


WD domain, G-beta repeat 


2e-21 


84.6 


771 


LRR 


Leucine Rich Repeat 


3.8e-06 


33.9 


774 


SNF2_N


SNF2 and others N-terminal domain 


5.5e-99 


342.3 


776 


VPS9 


Vacuolar sorting protein 9 (VPS9) 
domain 


1.1e-30


115.4 


777 


VPS9 


Vacuolar sorting protein 9 (VPS9)
domain 


1.1e-30


115.4 


778 


VPS9 


Vacuolar sorting protein 9 (VPS9)
domain


1.1e-30


115.4 


779 


zf-C3HC4 


Zinc finger, C3HC4 type (RING 
finger) 


3.1e-08 


31.0 


781 


cadherin 


Cadherin domain 


5.6e-113


388.7 


783 


HECT 


HECT-domain (ubiquitin- 
transferase). 


4.2e-31 


116.8 


785 


sushi 


Sushi domain (SCR repeat) 


1.8e-60 


214.3 


786 


sushi 


Sushi domain (SCR repeat) 


1.8e-60 


214.3 


788 


vwa 


von Willebrand factor type A domain 


1.9e-52


187.7 


790 


rrm


RNA recognition motif. 


2.8e-20 


80.8 


791 


Collagen 


Collagen triple helix repeat (20 
copies) 


0.00097 


9.7 


792 


pkinase 


Eukaryotic protein kinase domain 


0.023 


12.4 


795 


zf-C2H2 


Zinc finger, C2H2 type 




328.7 


796 


adh_short


short chain dehydrogenase 


4.1e-05 


-7.3 


799 


SAICAR_synt


SAICAR synthetase 


6e-125 


428.5 


805 


WD40 


WD domain, G-beta repeat 


4e-65 


229.8 


806 


ZU5 


ZU5 domain 


4.7e-37 


136.5 


807 


WD40 


WD domain, G-beta repeat 


0.016 


21.8 


808 


WD40 


WD domain, G-beta repeat 


0.0041 


23.8 


809 


pkinase 


Eukaryotic protein kinase domain 


2e-31 


117^ 


810 


vwa 


von Willebrand factor type A domain 


1.9e-52 


187.7 


814 


zf-C2H2 


Zinc finger, C2H2 type 


4.5e-83 


289.4 


815 


zf-C2H2 


Zinc finger, C2H2 type 


6e-74 


259.1 


817 


myosin_head


Myosin head (motor domain)


1.5e-176 


599.9 


818 


GSPII_E 


Bacterial type II secretion system 
protein 


0.012 


11.5 


819 


PDEase 


3'5'-cyclic nucleotide 
phosphodiesterase 


1.1e-74


215.5 


821 


PH 


PH domain 


0.00025 


20.5 


822 


CNH 


CNH domain 


0.00015 


-24.7 


827 


rrm


RNA recognition motif. 


1.5e-06 


35^ 
829


HMG box 


HMG (high mobility group) box 


7.8e-34


125.8 


830 


RasGEF 


RasGEF domain 


2.2e-102 


353.5


831 


CNH 


CNH domain 


3e-118 


406.2 


832 


mito_carr


Mitochondrial carrier proteins 


3.7e-37


130.3 


833 


FX 


FX domain 


2.7e-19 


77.5 


837 


Y_phosphatase


Protein-tyrosine phosphatase 


1.6e-263


888.8 


838 


ank


Ank repeat 


2.4e-270 


911.5


840 


ank 


Ank repeat 


5.8e-38 


139.6 


842 


Ribosomal_L15e 


Ribosomal L15


4.8e-131 


448.8 


843 


SNF 


Sodium:neurotransmitter symporter 
family 


0 


1201.8 


845 


Peptidase_M16


Insulinase (Peptidase family M16)


4.7e-67 


236.2 


848 


EF1BD


EF-1 guanine nucleotide exchange 
domain 


2.2e-56 


200.7 


849 


zf-C2H2 


Zinc finger, C2H2 type 


1.5e-122


420.5 


850 


zf-C2H2 


Zinc finger, C2H2 type 


2e-67


237.4 


852 


SIS 


SIS domain 


3.8e-30 


113.6 


853 


RhoGAP 


RhoGAP domain 


1.1e-37


138.6 


854 


PDZ 


PDZ domain (Also known as DHR or 
GLGF). 


5.1e-10


46.7 


856 


ACOX 


Acyl-CoA oxidase 


9.1e-263 


886.3 


858 


efhand 


EF hand


2.4e-18 


74.4 


860 


homeobox 


Homeobox domain 


4e-22 


86.9 


862 


TFIIF_beta


Transcription initiation factor IIF, 
beta 


2.2e-134 


459.8 


866 


A2M 


Alpha-2-macroglobulin family 


4.9e-21 


70.9 


867 


MoCF_biosynth 


Molybdenum cofactor biosynthesis
protei


5.8e-205 


694.3 


868 


EGF 


EGF-like domain 


4.1e-22 


86.9 


869 


EGF 


EGF-like domain 


1.1e-22


88.8 


871 


PI-PLC-X 


Phosphatidylinositol-specific 
phospholipase 


7.2e-95 


328.6 


872 


UCH-2


Ubiquitin carboxyl-terminal 
hydrolase family 


1.1e-20


82.1 


874 


SH3 


SH3 domain 


2.2e-14 


61.2 


877 


SH3 


SH3 domain 


8.6e-90 


311.7 


882 


KRAB 


KRAB box 


6.9e-45 


162.6 


885 


ank 


Ank repeat 


7.1e-07 


36.3 


886 


biopterin_H


Biopterin-dependent aromatic amino
acid h


0 


988.3 


887 


GTP_EFTU


Elongation factor Tu family 


4.9e-129 


437.5 


888 


zf-C3HC4 


Zinc finger, C3HC4 type (RING 
finger) 


1.6e-14 


51.4 


889 


zf-C2H2 


Zinc finger, C2H2 type 


3.7e-92 


319.6


890 


ig 


Immunoglobulin domain 


3.8e-06 


24.8 


892 


PTR2 


POT family 


9.5e-48 


163.0 


893 


Sulfatase 


Sulfatase 


3.5e-78 


273.2 


894 


Sulfatase 


Sulfatase 


3.5e-78 


273.2 


895 


7tm_l 


7 transmembrane receptor (rhodopsin 
family) 


4.5e-51 


164.4 


896 


Glyco_hydro_31


Glycosyl hydrolases family 31 


0 


1277.3 


897 


chromo 


'chromo' (CHRromatin Organization 
Modifier) 


3.9e-06


26.0 


898 


Cbl_N


CBL proto-oncogene N-terminal
domain 


1.2e-273 


922.4 


899 


vwa 


von Willebrand factor type A domain 


5.5e-32 


119.7 


900 


WD40 


WD domain, G-beta repeat 


2.7e-07 


37.7 


901 


zf-C2H2 


Zinc finger, C2H2 type 


4e-156 


532.1 


903 


ras 


Ras family 


6.6e-101 


348.6 
904


Armadillo_seg


Armadillo/beta-catenin-like repeats


1.1e-06


35.6 


906 


FH2 


Formin Homology 2 Domain 


4.5e-112 


385.7 


907 


Cytidylyltransf 


Cytidylyltransferase


1.4e-05 


29.3 


908 


pkinase 


Eukaryotic protein kinase domain 


1.2e-64


228.2 


909 


pkinase 


Eukaryotic protein kinase domain 


8.5e-70 


245.3 


910 


pkinase 


Eukaryotic protein kinase domain 


2.9e-42 


153.8 


911 


pkinase 


Eukaryotic protein kinase domain 


1.2e-35 


131.8 


912 


PHD 


PHD-finger


5.1e-06


33.4 


913 


PHD 


PHD-finger 


5.5e-16 


66.5 


916 


filament 


Intermediate filament proteins 


9.7e-121 


414.5 


917 


LIM 


LIM domain containing proteins 


5.9e-15 


57.9 


918 


SAM 


SAM domain (Sterile alpha motif) 


4.3e-16


66.9 


922 


Acylphosphatase 


Acylphosphatase 


2.9e-63


223.6 


924 


ig 


Immunoglobulin domain 


1.3e-08 


32.8 


925 


Acyl-CoA_dh 


Acyl-CoA dehydrogenase 


2.4e-131


449.8 


927 


7tm_1


7 transmembrane receptor (rhodopsin 
family) 


2.9e-45


145.9 


928 


globin 


Globin 


2.4e-52 


186.9 


929 


sugar_tr 


Sugar (and other) transporter 


1.2e-16


68.8 


932 


Collagen 


Collagen triple helix repeat (20 
copies) 


0.00097 


9.7 


933 


HMG box 


HMG (high mobility group) box 


7.8e-34 


125.8 


934 


SEA 


SEA domain 


0.0021 


24.7 


935 


ras 


Ras family 


6.4e-59 


209.2 


936 


CH 


Calponin homology (CH) domain


3.8e-21 


83.7 


937 


voltage_CLC 


Voltage gated chloride channels 


1.9e-199 


676.0 


938 


homeobox 


Homeobox domain 


1.9e-25 


98.0 


940 


pkinase 


Eukaryotic protein kinase domain 


9.9e-58 


205.2 


942 


Myosin_tail 


Myosin tail 


3.7e-09 


38.2 


943 


zf-C2H2 


Zinc finger, C2H2 type 


2.2e-92 


320.3 


945 


Clat_adaptor_s 


Clathrin adaptor complex small chain 


1.3e-76 


268.0 


946 


sugar_tr 


Sugar (and other) transporter 


0.017 


-122.8 


947 


tRNA-synt_1e


tRNA synthetases class I (C) 


0.00097 


15.6


948 


PHD 


PHD-finger 


2.2e-17 


71.2 


951 


sugar_tr 


Sugar (and other) transporter 


0.0082 


-113.9 


952 


mito_carr


Mitochondrial carrier proteins 


1.7e-54 


189.7 


953 


myb_DNA-binding


Myb-like DNA-binding domain 


4.5e-20 


80.1 


955 


ketoacyl-synt 


Beta-ketoacyl synthase 


7.1e-133 


454.8 


957 


aldo_ket_red


Aldo/keto reductase family 


1.5e-98 


340.8 


959 


Kelch 


Kelch motif 


0.02 


20.8 


961 


ras 


Ras family 


2.2e-29


111.1 


964 


homeobox 


Homeobox domain 


5.4e-22 


86.5 


965 


PH 


PH domain 


3e-21 


80.9 


966 


zf-C3HC4 


Zinc finger, C3HC4 type (RING 
finger) 


2.2e-09 


34.7 


967 


Ribosomal_L29


Ribosomal L29 protein 


1.6e-15 


65.0 


970 


FAD_binding_2 


FAD binding domain 


8.9e-47 


166.6 


971 


rve 


Integrase core domain 


0.00015 


19.8 


972 


Glycos_transf_2 


Glycosyl transferases 


2.1e-21


84.5 


974 


Ribosomal_L10


Ribosomal protein L10


3.3e-48 


173.6 


975 


7tm_1


7 transmembrane receptor (rhodopsin 
family) 


1.6e-37 


121.3 


976 


zf-C4 


Zinc finger, C4 type (two domains) 


2.1e-52 


178.5 


977 


zf-C2H2


Zinc finger, C2H2 type


6.6e-150


511.4 


978 


FTHFS 


Formate-tetrahydrofolate ligase 


0 


1367.2 


982 


Renal_dipeptase


Renal dipeptidase 


1.3e-73 


258.0 


984 


A_deaminase


Adenosine/AMP deaminase 


2.6e-05 


-48.6 
TABLES



SEQ ID NO: of
full-length
nucleotide
sequence


SEQ ID NO: of
full-length
peptide
sequence


SEQ ID NO: of
contig
nucleotide
sequence


SEQ ID NO: of
contig
peptide
sequence


Priority docket
number_corresponding
SEQ ID NO: in
priority application


SEQ ID NO: in
U.S.S.N. 09/496,914


1


985 


1969 


2953 


787CIP2_1


150 


2 


986 


1970 


2954 


787CIP2_2 


223 


3 


987 


1971


2955 


787CIP2_3 


1884 


4 


988 


1972 


2956 


787CIP2_4 


2123 


5 


989 


1973 


2957 


787CIP2_5


2313 


6 


990 


1974 


2958 


787CIP2_6


3284 


7 


991 


1975 


2959 


787CIP2_7 


3324 


8 


992 


1976 


2960 


787CIP2_8


6182 


9 


993 


1977 


2961 


787CIP2_9


6210 


10 


994 


1978 


2962 


787CIP2_10 


6213 


11 


995 


1979 


2963 


787CIP2_11


6257 


12 


996 


1980 


2964 


787CIP2_12


6294 


13 


997 


1981 


2965 


787CIP2_13 


6294 


14 


998 


1982 


2966 


787CIP2_14 


6330 


15 


999 


1983 


2967 


787CIP2_15 


6364 


16 


1000 


1984 


2968 


787CIP2_16


6455 


17 


1001 


1985 


2969 


787CIP2_17


6486 


18 


1002 


1986 


2970 


787CIP2_18


6503 


19 


1003 


1987 


2971 


787CIP2_19 


6528 


20 


1004


1988 


2972 


787CIP2_20


6572 


21 


1005 


1989 


2973 


787CIP2_21


6578 


22 


1006 


1990 


2974 


787CIP2_22 


6593 


23 


1007 


1991 


2975 


787CIP2_23 


6603 


24 


1008 


1992 


2976 


787CIP2_24


6603 


25 


1009 


1993 


2977 


787CIP2_25


6679 


26 


1010 


1994 


2978 


787CIP2_26


6744 


27 


1011 


1995 


2979 


787CIP2_27


6762 


28 


1012 


1996 


2980 


787CIP2_28 


6770 


29 


1013 


1997 


2981 


787CIP2_29


6770 


30 


1014 


1998 


2982 


787CIP2_30


6787 


31 


1015 


1999 


2983 


787CIP2_31


6858 


32 


1016 


2000 


2984 


787CIP2_32


6866 


33 


1017 


2001 


2985 


787CIP2_33


6938 


34 


1018 


2002 


2986 


787CIP2_34


6938 


35 


1019 


2003 


2987 


787CIP2_35


6977 


36 


1020 


2004 


2988 


787CIP2_36


7001 


37 


1021 


2005 


2989 


787CIP2_37


7002 


38 


1022 


2006 


2990 


787CIP2_38


7004 


39 


1023 


2007 


2991 


787CIP2_39


7005 


40 


1024 


2008 


2992 


787CIP2_40


7006 


41 


1025 


2009 


2993 


787CIP2_41


7008 


42 


1026 


2010 


2994 


787CIP2_42


7014 


43 


1027 


2011 


2995 


787CIP2_43


7021 


44 


1028 


2012 


2996 


787CIP2_44


7022 


45 


1029 


2013 


2997 


787CIP2_46


7057 


46 


1030 


2014 


2998 


787CIP2_47


7058 


47 


1031 


2015 


2999 


787CIP2_49 


7088 


48 


1032 


2016 


3000 


787CIP2_50


7089 


49 


1033 


2017 


3001 


787CIP2_51


7182 


50 


1034 


2018 


3002 


787CIP2_52 


7489 


51 


1035 


2019 


3003 


787CIP2_53


7564 


52 


1036 


2020 


3004 


787CIP2_54


7566 


53 


1037 


2021 


3005 


787CIP2_55


7587 
54


1038 


2022 


3006 


787CIP2_56


7591 


55 


1039 


2023 


3007 


787CIP2_57


7600 


56 


1040 


2024


3008 


787CIP2_58 


7604 


57 


1041 


2025 


3009 


787CIP2_59


7612 


58 


1042 


2026 


3010 


787CIP2_60


7613 


59 


1043 


2027 


3011 


787CIP2_61


7615 


60 


1044 


2028 


3012 


787CIP2_62


7616 


61 


1045 


2029 


3013 


787CIP2_63


7617 


62 


1046 


2030 


3014 


787CIP2_64


7623 


63 


1047 


2031 


3015 


787CIP2_65


7625 


64 


1048 


2032 


3016 


787CIP2_66 


7625 


65 


1049 


2033 


3017 


787CIP2_67 


7630 


66 


1050 


2034 


3018 


787CIP2_68 


7638 


67 


1051 


2035 


3019 


787CIP2_69 


7640 


68 


1052 


2036 


3020 


787CIP2_70


7670 


69 


1053 


2037 


3021 


787CIP2_71 


7676 


70 


1054 


2038 


3022 


787CIP2_72 


7688 


71 


1055 


2039 


3023 


787CIP2_73


7690 


72 


1056 


2040 


3024 


787CIP2_74 


7700 


73 


1057 


2041 


3025 


787CIP2_75


7774 


74 


1058 


2042 


3026 


787CIP2_76


7784 


75 


1059 


2043 


3027 


787CIP2_77


7785 


76 


1060 


2044 


3028 


787CIP2_78


7792 


77 


1061 


2045 


3029 


787CIP2_79


7798 


78 


1062 


2046 


3030 


787CIP2_80 


7807 


79 


1063 


2047 


3031 


787CIP2_81


7810 


80 


1064 


2048 


3032 


787CIP2_82 


7812 


81 


1065 


2049 


3033 


787CIP2_83 


7816 


82 


1066 


2050 


3034 


787CIP2 84 


7826 


83 


1067 


2051 


3035 


787CIP2 85 


7842 


84 


1068 


2052 


3036 


787CIP2_86


7850 


85 


1069 


2053 


3037 


787CIP2_87 


7865 


86 


1070 


2054 


3038 


787CIP2_88 


7882 


87 


1071 


2055 


3039 


787CIP2_89 


7891 


88 


1072 


2056 


3040 


787CIP2 90 


7892 


89 


1073 


2057 


3041 


787CIP2_91


7896 


90 


1074 


2058 


3042 


787CIP2 92 


7896 


91 


1075 


2059 


3043 


787CIP2 93 


7907


92 


1076 


2060 


3044 


787CIP2_94 


7913 


93 


1077 


2061 


3045 


787CIP2_95


7914 


94 


1078 


2062 


3046 


787CIP2_96


7915 


95 


1079 


2063 


3047 


787CIP2_97


7920 


96 


1080 


2064 


3048 


787CIP2 98 


7921 


97 


1081 


2065 


3049 


787CIP2 99 


7924 


98 


1082 


2066 


3050 


787CIP2 100 


7927 


99 


1083 


2067 


3051 


787CIP2_101


7929 


100 


1084 


2068 


3052 


787CIP2_102 


7937 


101 


1085 


2069 


3053 


787CIP2 103 


7940 


102 


1086 


2070 


3054 


787CIP2_104


7942 


103 


1087 


2071 


3055 


787CIP2 105 


7944 


104 


1088 


2072 


3056 


787CIP2_106


7951 


105 


1089 


2073 


3057 


787CIP2_107


7951 


106 


1090 


2074 


3058 


787CIP2 108 


7962 


107 


1091 


2075 


3059 


787CIP2_109 


7964 


108 


1092 


2076 


3060 


787CIP2_110


7977 


109 


1093 


2077 


3061 


787CIP2_111 


7978 


110 


1094 


2078 


3062 


787CIP2_112


7980 


111 


1095 


2079 


3063 


787CIP2_113


7982 


112 


1096 


2080 


3064 


787CIP2_114


8000 


113 


1097 


2081 


3065 


787CIP2_115


8003 
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114 


1098 


2082 


3066 


787CIP2_116


8004 


115 


1099


2083 


3067 


787CIP2_117 


8007 


116 


1100 


2084 


3068 


787CIP2 118 


8008 


117 


1101 


2085 


3069 


787CIP2_119 


8009 


118 


1102 


2086 


3070 


787CIP2_120 


8013 


119 


1103 


2087 


3071 


787CIP2 121 


8017 


120 


1104 


2088 


3072 


787CIP2_122 


8018 


121 


1105 


2089 


3073 


787CIP2_123 


8021 


122 


1106 


2090 


3074 


787CIP2_124 


8022 


123 


1107 


2091 


3075 


787CIP2_125


8023 


124 


1108 


2092 


3076 


787CIP2 126 


8023 


125 


1109 


2093 


3077 


787CIP2_127 


8024 


126 


1110 


2094 


3078 


787CIP2 128 


8026 


127 


1111 


2095 


3079 


787CIP2 129 


8028 


128 


1112 


2096 


3080 


787CIP2 130 


8036 


129 


1113 


2097 


3081 


787CIP2_131 


8038 


130 


1114 


2098 


3082 


787CIP2_132 


8045 


131 


1115 


2099 


3083 


787CIP2_133


8045


132 


1116 


2100 


3084 


787CIP2 134 


8048 


133 


1117 


2101 


3085 


787CIP2 135 


8048 


134 


1118 


2102 


3086 


787CIP2 136 


8052 


135 


1119 


2103 


3087 


787CIP2 137 


8053 


136 


1120 


2104 


3088 


787CIP2 138 


8055 


137 


1121 


2105 


3089 


787CIP2 139 


8059 


138 


1122 


2106 


3090 


787CIP2_140 


8061 


139 


1123 


2107 


3091 


787CIP2_141 


8062 


140 


1124 


2108 


3092 


787CIP2_142 


8063 


141 


1125 


2109 


3093 


787CIP2 143 


8064 


142 


1126 


2110 


3094 


787CIP2 144 


8065 


143 


1127 


2111 


3095 


787CIP2_145 


8068 


144 


1128 


2112 


3096 


787CIP2 146 


8069 


145 


1129 


2113 


3097 


787CIP2 147 


8070 


146 


1130 


2114 


3098 


787CIP2 148 


8074 


147 


1131 


2115 


3099 


787CIP2 149 


8076 


148 


1132 


2116 


3100 


787CIP2 150 


8077 


149 


1133 


2117 


3101 


787CIP2_151 


8078 


150 


1134 


2118 


3102 


787CIP2 152 


8079 


151 


1135 


2119 


3103 


787CIP2 153 


8087 


152 


1136 


2120 


3104 


787CIP2 154 


8091 


153 


1137 


2121 


3105 


787CIP2_155 


8100 


154 


1138 


2122 


3106 


787CIP2_156


8105 


155 


1139 


2123 


3107 


787CIP2 157 


8106 


156 


1140 


2124 


3108 


787CIP2 158 


8108 


157 


1141 


2125 


3109 


787CIP2 159 


8109 


158 


1142 


2126 


3110 


787CIP2_160


8110 


159 


1143 


2127 


3111 


787CIP2 161 


8112 


160 


1144 


2128 


3112 


787CIP2_162


8116 


161 


1145 


2129 


3113 


787CIP2 163 


8118 


162 


1146 


2130 


3114 


787CIP2 164 


8124 


163 


1147 


2131 


3115 


787CIP2 165 


8125 


164 


1148 


2132 


3116 


787CIP2 166 


8127 


165 


1149 


2133 


3117 


787CIP2 167 


8132 


166 


1150 


2134 


3118 


787CIP2 168 


8135 


167 


1151 


2135 


3119 


787CIP2 169 


8137 


168 


1152 


2136 


3120 


787CIP2 170 


8139 


169 


1153 


2137 


3121 


787CIP2_171


8140 


170 


1154 


2138 


3122 


787CIP2_172 


8140 


171 


1155 


2139 


3123 


787CIP2_173 


8140 


172 


1156 


2140 


3124 


787CIP2_174 


8141 


173 


1157 


2141 


3125 


787CIP2 175 


8147 
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174 


1158 


2142 


3126 


787CIP2_176 


8149 


175


1159 


2143 


3127 


787CIP2_177


8150 


176


1160 


2144 


3128 


787CIP2 178 


8157 


177 


1161 


2145 


3129 


787CIP2_179 


8161 


178 


1162 


2146 


3130 


787CIP2 180 


8162 


179 


1163 


2147 


3131 


787CIP2_181


8165 


180 


1164 


2148 


3132 


787CIP2 182 


8166 


181 


1165 


2149 


3133 


787CIP2 183 


8167 


182 


1166 


2150 


3134 


787CIP2 184 


8169 


183 


1167 


2151 


3135 


787CIP2_185 


8170 


184 


1168 


2152 


3136 


787CIP2 186 


8172 


185 


1169 


2153 


3137 


787CIP2_187 


8173 


186 


1170 


2154 


3138 


787CIP2_188


8174 


187 


1171 


2155 


3139 


787CIP2_189


8174 


188 


1172 


2156 


3140 


787CIP2_191


8182 


189 


1173 


2157 


3141 


787CIP2 192 


8186 


190 


1174 


2158 


3142 


787CIP2 193 


8188 


191 


1175 


2159 


3143 


787CIP2 194 


8191 


192 


1176 


2160 


3144 


787CIP2 195 


8192 


193 


1177 


2161 


3145 


787CIP2 196 


8193 


194 


1178 


2162 


3146 


787CIP2 197 


8194 


195 


1179 


2163 


3147 


787CIP2_198


8195 


196 


1180 


2164 


3148 


787CIP2 199 


8196 


197 


1181 


2165 


3149 


787CIP2 200 


8200 


198 


1182 


2166 


3150 


787CIP2_201


8201 


199 


1183 


2167 


3151 


787CIP2_202 


8202 


200


1184 


2168 


3152 


787CIP2_203 


8205 


201 


1185 


2169 


3153 


787CIP2_204 


8206 


202 


1186 


2170 


3154 


787CIP2_205 


8207 


203 


1187 


2171 


3155 


787CIP2 206 


8208 


204 


1188 


2172 


3156 


787CIP2 207 


8209 


205 


1189 


2173 


3157 


787CIP2_208 


8210 


206 


1190 


2174 


3158 


787CIP2_209


8211 


207 


1191 


2175 


3159 


787CIP2 210 


8212 


208 


1192 


2176 


3160 


787CIP2_211 


8213 


209 


1193 


2177 


3161 


787CIP2_212 


8214 


210 


1194 


2178 


3162 


787CIP2_213 


8215 


211 


1195 


2179 


3163 


787CIP2_214 


8216 


212 


1196 


2180 


3164 


787CIP2_215 


8217 


213 


1197 


2181 


3165 


787CIP2_217 


8221 


214 


1198 


2182 


3166 


787CIP2 218 


8222 


215 


1199 


2183 


3167 


787CIP2 219 


8223 


216 


1200 


2184 


3168 


787CIP2 220 


8224 


217 


1201 


2185 


3169 


787CIP2 221 


8225 


218 


1202 


2186 


3170 


787CIP2 222 


8227 


219 


1203 


2187 


3171 


787CIP2 223 


8232 


220 


1204 


2188 


3172 


787CIP2_224 


8235 


221 


1205 


2189 


3173 


787CIP2 225 


8236 


222 


1206 


2190 


3174 


787CIP2 227 


8238 


223 


1207 


2191 


3175 


787CIP2 228 


8239 


224 


1208 


2192 


3176 


787CIP2 229 


8240 


225 


1209 


2193 


3177 


787CIP2 230 


8242 


226 


1210 


2194 


3178 


787CIP2_231 


8246 


227 


1211 


2195 


3179 


787CIP2_232 


8252 


228 


1212 


2196 


3180 


787CIP2_233 


8257 


229 


1213 


2197 


3181 


787CIP2 234 


8288 


230 


1214 


2198 


3182 


787CIP2_235 


8310 


231 


1215 


2199 


3183 


787CIP2_236 


8311 


232 


1216 


2200 


3184 


787CIP2_237 


8315 


233 


1217


2201 


3185 


787CIP2_238 


8318 
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234 


1218 


2202 


3186 


787CIP2 239 


8326 


235 


1219 


2203 


3187 


787CIP2 240 


8326 


236 


1220 


2204 


3188 


787CIP2 241 


8336 


237 


1221 


2205 


3189 


787CIP2_242 


8351 


238 


1222 


2206 


3190 


787CIP2 243 


8364 


239 


1223 


2207 


3191 


787CIP2 244 


8372 


240 


1224 


2208 


3192 


787CIP2 245 


8376 


241 


1225 


2209 


3193 


787CIP2_246 


8377 


242 


1226 


2210 


3194 


787CIP2_247 


8382 


243 


1227 


2211 


3195 


787CIP2 248 


8404 


244 


1228 


2212 


3196 


787CIP2 249 


8410 


245 


1229 


2213 


3197 


787CIP2 250 


8419 


246 


1230 


2214 


3198 


787CIP2 251 


8430 


247 


1231 


2215 


3199 


787CIP2 252 


8448 


248 


1232 


2216 


3200 


787CIP2 253 


8458 


249 


1233 


2217 


3201 


787CIP2 254 


8461 


250 


1234 


2218 


3202 


787CIP2 255 


8466 


251 


1235 


2219 


3203 


787CIP2 256 


8468 


252 


1236 


2220 


3204 


787CIP2 257 


8477 


253 


1237 


2221 


3205 


787CIP2 258 


8481 


254 


1238 


2222 


3206 


787CIP2 259 


8491 


255 


1239 


2223 


3207 


787CIP2 260 


8503 


256 


1240 


2224 


3208 


787CIP2 261 


8513 


257 


1241 


2225 


3209 


787CIP2 262 


8514 


258 


1242 


2226 


3210 


787CIP2 263 


8518 


259 


1243 


2227 


3211 


787CIP2_264


8547 


260 


1244 


2228


3212 


787CIP2_265


8549 


261 


1245 


2229 


3213 


787CIP2_266


8549 


262 


1246 


2230 


3214 


787CIP2_267 


8549 


263 


1247 


2231 


3215 


787CIP2_268 


8550 


264 


1248 


2232 


3216 


787CIP2 269 


8603 


265 


1249 


2233 


3217 


787CIP2 270 


8625 


266 


1250 


2234 


3218 


787CIP2_271 


8625 


267 


1251 


2235 


3219 


787CIP2 272 


8633 


268 


1252 


2236 


3220 


787CIP2 273 


8648 


269 


1253 


2237 


3221 


787CIP2_274 


8654 


270 


1254 


2238 


3222 


787CIP2_275


8671 


271 


1255 


2239 


3223 


787CIP2_276 


8733 


272 


1256 


2240 


3224 


787CIP2_277 


8735 


273 


1257 


2241 


3225 


787CIP2 278 


8747 


274 


1258 


2242 


3226 


787CIP2_279 


8748 


275 


1259 


2243 


3227 


787CIP2_280


8753 


276 


1260 


2244 


3228 


787CIP2 281 


8770 


277 


1261 


2245 


3229 


787CIP2_282


8777 


278 


1262 


2246 


3230 


787CIP2_283 


8828 


279 


1263 


2247 


3231


787CIP2_284 


8836 


280 


1264 


2248 


3232 


787CIP2_285 


8842 


281 


1265 


2249 


3233 


787CIP2 286 


8842 


282 


1266 


2250 


3234 


787CIP2 287 


8850 


283 


1267 


2251 


3235 


787CIP2_288


8851 


284 


1268 


2252 


3236 


787CIP2_289 


8852 


285 


1269 


2253 


3237 


787CIP2_290 


8853 


286 


1270 


2254 


3238 


787CIP2_291 


8854 


287 


1271 


2255 


3239 


787CIP2_292 


9084 


288 


1272 


2256 


3240 


787CIP2_293 


9099 


289 


1273 


2257 


3241 


787CIP2_294 


9691 


290 


1274 


2258 


3242 


787CIP2 295 


9699 


291 


1275 


2259 


3243 


787CIP2 296 


9883 


292 


1276 


2260 


3244 


787CIP2_297 


9886 


293 


1277 


2261 


3245 


787CIP2 298 


10334 
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294 


1278 


2262 


3246 


787CIP2 299 


10335 


295 


1279 


2263 


3247 


787CIP2 300 


10336 


296 


1280 


2264


3248 


787CIP2_301


10338 


297 


1281 


2265 


3249 


787CIP2 302 


10339 


298 


1282 


2266 


3250 


787CIP2 304 


10342 


299 


1283 


2267 


3251 


787CIP2 305 


10342 


300 


1284 


2268 


3252 


787CIP2 306 


10343 


301 


1285 


2269 


3253 


787CIP2_307


10344 


302 


1286 


2270 


3254 


787CIP2 308 


10345 


303 


1287 


2271


3255 


787CIP2 309 


10346 


304 


1288 


2272 


3256 


787CIP2 310 


10347 


305 


1289 


2273 


3257 


787CIP2_311 


10348 


306 


1290 


2274 


3258 


787CIP2_312


10349 


307 


1291 


2275 


3259 


787CIP2 314 


10351 


308 


1292 


2276 


3260 


787CIP2 315 


10352 


309 


1293 


2277 


3261 


787CIP2_316 


10353 


310 


1294 


2278 


3262 


787CIP2_317 


10354 


311 


1295 


2279 


3263 


787CIP2 318 


10355 


312 


1296 


2280 


3264 


787CIP2 319 


10356 


313 


1297 


2281 


3265 


787CIP2_320 


10357 


314 


1298 


2282 


3266 


787CIP2 321 


10358 


315 


1299 


2283 


3267 


787CIP2 322 


10360 


316 


1300 


2284 


3268 


787CIP2 323 


10361 


317 


1301 


2285 


3269 


787CIP2 324 


10362 


318 


1302 


2286 


3270 


787CIP2 325 


10363 


319 


1303 


2287 


3271 


787CIP2 326 


10365 


320 


1304 


2288 


3272 


787CIP2 327 


10366 


321 


1305 


2289 


3273 


787CIP2 328 


10367 


322 


1306 


2290 


3274 


787CIP2 329 


10369 


323 


1307 


2291 


3275 


787CIP2 330 


10370 


324 


1308 


2292 


3276 


787CIP2 331 


10371 


325 


1309 


2293 


3277 


787CIP2_332 


10372 


326 


1310 


2294 


3278 


787CIP2 333 


10373 


327 


1311 


2295 


3279 


787CIP2 334 


10375 


328 


1312 


2296 


3280 


787CIP2 335 


10377 


329 


1313 


2297 


3281 


787CIP2 336 


10379 


330 


1314 


2298 


3282 


787CIP2 337 


10381 


331 


1315 


2299 


3283 


787CIP2 338 


10382 


332 


1316 


2300 


3284 


787CIP2 339 


10383 


333 


1317 


2301 


3285 


787CIP2 340 


10384 


334


1318 


2302 


3286 


787CIP2_341 


10385 


335 


1319 


2303 


3287 


787CIP2_342 


10386 


336 


1320 


2304 


3288 


787CIP2_343 


10387 


337 


1321 


2305


3289 


787CIP2_346


10391 


338 


1322 


2306 


3290 


787CIP2 348 


10393 


339 


1323 


2307 


3291 


787CIP2 349 


10394 


340 


1324 


2308 


3292 


787CIP2 350 


10395 


341 


1325 


2309 


3293 


787CIP2 351 


10396 


342 


1326 


2310 


3294 


787CIP2 352 


10397 


343 


1327 


2311 


3295 


787CIP2_353 


10399 


344 


1328 


2312 


3296 


787CIP2 354 


10400 


345 


1329 


2313 


3297 


787CIP2 355 


10401 


346 


1330 


2314 


3298 


787CIP2 357 


10403 


347 


1331 


2315 


3299 


787CIP2 358 


10404 


348 


1332 


2316 


3300 


787CIP2 359 


10407 


349 


1333 


2317 


3301 


787CIP2 360 


10408 


350 


1334 


2318 


3302 


787CIP2 361 


10409 


351 


1335 


2319 


3303 


787CIP2 362 


10410 


352 


1336 


2320 


3304 


787CIP2B 1 


44 


353 


1337


2321 


3305 


787CIP2B_2 


50 
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354 


1338 


2322 


3306 


787CIP2B 3 


93 


355 


1339 


2323 


3307 


787CIP2B_4 


224 


356 


1340 


2324 


3308 


787CIP2B 5 


318 


357 


1341 


2325 


3309 


787CIP2B 6 


318 


358


1342 


2326 


3310 


787CIP2B 7 


795 


359 


1343 


2327 


3311 


787CIP2B 8 


857 


360 


1344 


2328 


3312 


787CIP2B_9


924 


361 


1345 


2329 


3313 


787CIP2B 10 


944 


362 


1346 


2330 


3314 


787CIP2B 11 


944 


363 


1347 


2331 


3315 


787CIP2B 12 


967 


364 


1348 


2332 


3316 


787CIP2B_13 


1055 


365 


1349 


2333 


3317 


787CIP2B_14 


1091 


366 


1350 


2334 


3318 


787CIP2B 15 


1225 


367 


1351 


2335 


3319 


787CIP2B 16 


1257 


368 


1352 


2336 


3320 


787CIP2B 17 


1289 


369 


1353 


2337 


3321 


787CIP2B_18 


1292 


370 


1354 


2338 


3322 


787CIP2B_19 


1455 


371 


1355 


2339 


3323 


787CIP2B_20 


1488 


372 


1356 


2340 


3324 


787CIP2B 21 


1666 


373 


1357 


2341 


3325 


787CIP2B 22 


1811 


374 


1358 


2342 


3326 


787CIP2B 23 


1885 


375 


1359 


2343 


3327 


787CIP2B_24


1911 


376 


1360 


2344 


3328 


787CIP2B_25


1935 


377 


1361 


2345 


3329 


787CIP2B_26 


1971 


378 


1362 


2346 


3330 


787CIP2B 27 


1989 


379 


1363 


2347 


3331 


787CIP2B_28 


2041 


380 


1364 


2348 


3332 


787CIP2B 29 


2178 


381 


1365 


2349 


3333 


787CIP2B 30 


2237 


382 


1366 


2350 


3334 


787CIP2B_31 


2279 


383 


1367 


2351 


3335 


787CIP2B 32 


2338 


384 


1368 


2352 


3336 


787CIP2B_33


2351 


385 


1369 


2353 


3337 


787CIP2B_34 


2405 


386 


1370 


2354 


3338 


787CIP2B 35 


2531 


387


1371 


2355 


3339 


787CIP2B 36 


2584 


388 


1372 


2356 


3340 


787CIP2B 37 


2608 


389 


1373 


2357 


3341 


787CIP2B_38


2655 


390 


1374 


2358 


3342 


787CIP2B 39 


2656 


391 


1375 


2359 


3343


787CIP2B 40 


2866 


392 


1376 


2360 


3344


787CIP2B_41


3015 


393 


1377 


2361 


3345 


787CIP2B_42


3015 


394 


1378 


2362 


3346 


787CIP2B_43 


3043 


395 


1379 


2363 


3347 


787CIP2B 44 


3986 


396 


1380 


2364 


3348 


787CIP2B 45 


4647 


397 


1381 


2365 


3349 


787CIP2B_46


4659 


398 


1382 


2366 


3350 


787CIP2B 47 


5032 


399 


1383 


2367 


3351 


787CIP2B_48 


5244 


400 


1384 


2368 


3352 


787CIP2B 49 


5268 


401 


1385 


2369


3353 


787CIP2B 50 


5281 


402 


1386 


2370 


3354 


787CIP2B 51 


5282 


403 


1387 


2371 


3355 


787CIP2B_52 


6147 


404 


1388 


2372 


3356


787CIP2B 53 


6178 


405 


1389 


2373 


3357 


787CIP2B_54


6184 


406 


1390 


2374 


3358 


787CIP2B_55 


6187 


407 


1391 


2375 


3359 


787CIP2B_56 


6190 


408 


1392 


2376 


3360 


787CIP2B_57 


6191 


409 


1393 


2377 


3361 


787CIP2B_58


6194 


410 


1394 


2378 


3362 


787CIP2B_59 


6196 


411 


1395 


2379 


3363 


787CIP2B_60 


6201 


412 


1396 


2380 


3364 


787CIP2B_61 


6208 


413 


1397 


2381 


3365 


787CIP2B_62 


6214 
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414 


1398 


2382 


3366 


787CIP2B_63 


6217 


415 


1399 


2383 


3367 


787CIP2B_64


6220 


416 


1400 


2384 


3368 


787CIP2B_65


6221 


417 


1401 


2385 


3369 


787CIP2B_66 


6222 


418 


1402 


2386 


3370 


787CIP2B_67


6223 


419 


1403 


2387 


3371 


787CIP2B_68 


6223 


420 


1404 


2388 


3372 


787CIP2B_69 


6226 


421 


1405 


2389 


3373 


787CIP2B_70 


6227 


422 


1406 


2390 


3374 


787CIP2B_71 


6229 


423 


1407 


2391 


3375 


787CIP2B_72 


6248 


424 


1408 


2392 


3376 


787CIP2B_73 


6260 


425 


1409 


2393 


3377 


787CIP2B_74 


6264 


426 


1410 


2394 


3378 


787CIP2B_75 


6269 


427 


1411 


2395 


3379 


787CIP2B_76 


6269 


428 


1412 


2396 


3380 


787CIP2B 77 


6275 


429 


1413 


2397 


3381 


787CIP2B_78 


6276 


430 


1414 


2398 


3382 


787CIP2B_79 


6280 


431 


1415 


2399 


3383 


787CIP2B 80 


6287 


432 


1416 


2400 


3384 


787CIP2B_81 


6290 


433 


1417 


2401 


3385 


787CIP2B 82 


6293 


434 


1418 


2402 


3386 


787CIP2B 83 


6305 


435


1419 


2403 


3387 


787CIP2B 84 


6308 


436 


1420 


2404 


3388 


787CIP2B_85 


6309 


437 


1421 


2405 


3389 


787CIP2B_86 


6312 


438 


1422 


2406 


3390 


787CIP2B 87 


6314 


439 


1423 


2407 


3391 


787CIP2B_88 


6316 


440 


1424 


2408 


3392 


787CIP2B_89 


6336 


441 


1425 


2409 


3393 


787CIP2B 90 


6341 


442 


1426 


2410 


3394 


787CIP2B 91 


6343 


443 


1427 


2411 


3395 


787CIP2B 92 


6346 


444 


1428 


2412 


3396 


787CIP2B_93 


6357 


445 


1429 


2413 


3397 


787CIP2B_94 


6359 


446 


1430 


2414 


3398 


787CIP2B_95 


6367 


447 


1431 


2415 


3399 


787CIP2B_96 


6383 


448 


1432 


2416 


3400 


787CIP2B 97 


6385 


449 


1433 


2417 


3401 


787CIP2B_98 


6396 


450 


1434 


2418 


3402 


787CIP2B 99 


6396 


451 


1435 


2419 


3403 


787CIP2B 100 


6403 


452 


1436 


2420 


3404 


787CIP2B 101 


6405 


453 


1437 


2421 


3405 


787CIP2B 102 


6414 


454 


1438 


2422 


3406 


787CIP2B 103 


6418 


455 


1439 


2423 


3407 


787CIP2B 104 


6422 


456 


1440 


2424 


3408 


787CIP2B_105 


6425 


457 


1441 


2425 


3409 


787CIP2B_106 


6436 


458 


1442 


2426 


3410 


787CIP2B_107 


6471 


459 


1443 


2427 


3411 


787CIP2B 108 


6474 


460 


1444 


2428 


3412 


787CIP2B_109 


6482 


461 


1445 


2429 


3413 


787CIP2B_110 


6504 


462 


1446 


2430 


3414 


787CIP2B 111 


6510 


463 


1447 


2431 


3415 


787CIP2B_112


6515 


464 


1448 


2432 


3416 


787CIP2B_113


6529 


465 


1449 


2433 


3417 


787CIP2B_114 


6535 


466 


1450 


2434 


3418 


787CIP2B 115 


6536 


467 


1451 


2435 


3419 


787CIP2B_116 


6536 


468 


1452 


2436 


3420 


787CIP2B_117 


6541 


469 


1453 


2437 


3421


787CIP2B 118 


6542 


470 


1454 


2438 


3422 


787CIP2B 119 


6547 


471 


1455 


2439 


3423 


787CIP2B_120


6548 


472 


1456 


2440 


3424 


787CIP2B_121 


6552 


473 


1457 


2441 


3425 


787CIP2B 122 


6552 
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474 


1458 


2442 


3426 


787CIP2B 123 


6555 


475 


1459 


2443 


3427 


787CIP2B 124 


6560 


476 


1460 


2444 


3428 


787CIP2B 125 


6566 


477 


1461 


2445 


3429 


787CIP2B 126 


6576 


478 


1462 


2446 


3430 


787CIP2B_127


6584 


479 


1463 


2447 


3431 


787CIP2B_128 


6588 


480 


1464 


2448 


3432 


787CIP2B 129 


6589 


481 


1465 


2449 


3433 


787CIP2B 130 


6590 


482 


1466 


2450 


3434 


787CIP2B 131 


6597 


483 


1467 


2451 


3435 


787CIP2B 132 


6600 


484 


1468 


2452 


3436 


787CIP2B 133 


6602 


485 


1469 


2453 


3437 


787CIP2B_134 


6604 


486 


1470 


2454 


3438 


787CIP2B_135 


6605 


487 


1471 


2455 


3439 


787CIP2B 136 


6608 


488 


1472 


2456 


3440 


787CIP2B 137 


6610 


489 


1473 


2457 


3441 


787CIP2B 138 


6614 


490 


1474 


2458 


3442 


787CIP2B 139 


6623 


491 


1475 


2459 


3443 


787CIP2B 140 


6629 


492 


1476 


2460 


3444 


787CIP2B_141 


6631 


493 


1477


2461 


3445 


787CIP2B_142 


6631 


494 


1478 


2462 


3446 


787CIP2B 143 


6631 


495 


1479 


2463 


3447 


787CIP2B 144 


6632 


496 


1480 


2464 


3448 


787CIP2B 145 


6633 


497 


1481 


2465 


3449 


787CIP2B 146 


6634 


498 


1482 


2466 


3450 


787CIP2B_147 


6635 


499 


1483 


2467 


3451 


787CIP2B 148 


6639 


500 


1484 


2468 


3452 


787CIP2B 149 


6649 


501 


1485 


2469 


3453 


787CIP2B_150


6651 


502 


1486 


2470 


3454 


787CIP2B 151 


6655 


503 


1487 


2471 


3455 


787CIP2B 152 


6658 


504 


1488 


2472 


3456 


787CIP2B 153 


6667 


505 


1489 


2473 


3457 


787CIP2B 154 


6672 


506 


1490 


2474 


3458 


787CIP2B 155 


6682 


507 


1491 


2475 


3459 


787CIP2B 156 


6683 


508 


1492 


2476 


3460 


787CIP2B 157 


6687 


509 


1493 


2477 


3461 


787CIP2B 158 


6687 


510 


1494 


2478 


3462 


787CIP2B 159 


6688 


511 


1495 


2479 


3463 


787CIP2B 160 


6696 


512 


1496 


2480 


3464 


787CIP2B_161


6701 


513 


1497 


2481 


3465 


787CIP2B 162 


6707 


514 


1498 


2482 


3466 


787CIP2B 163 


6712 


515 


1499 


2483 


3467 


787CIP2B 164 


6714 


516 


1500 


2484 


3468 


787CIP2B_165 


6720 


517 


1501 


2485 


3469 


787CIP2B_166


6721 


518 


1502 


2486 


3470 


787CIP2B_167 


6722 


519 


1503 


2487 


3471 


787CIP2B 168 


6736 


520 


1504 


2488 


3472 


787CIP2B 169 


6740 


521 


1505 


2489 


3473 


787CIP2B 170 


6740 


522 


1506 


2490 


3474 


787CIP2B_171


6760 


523 


1507 


2491 


3475 


787CIP2B 172 


6775 


524 


1508 


2492 


3476


787CIP2B 173 


6784 


525 


1509 


2493 


3477 


787CIP2B_174 


6793 


526 


1510 


2494 


3478 


787CIP2B 175 


6795


527 


1511 


2495 


3479 


787CIP2B_176


6796 


528 


1512 


2496 


3480 


787CIP2B 177 


6807 


529 


1513 


2497 


3481 


787CIP2B 178 


6808 


530 


1514 


2498 


3482 


787CIP2B_179


6810 


531 


1515 


2499 


3483


787CIP2B 180 


6815 


532 


1516 


2500 


3484 


787CIP2B 181 


6819 


533 


1517 


2501 


3485 


787CIP2B 182 


6821 
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534 


1518 


2502 


3486 


787CIP2B_183 


6827 


535 


1519 


2503 


3487 


787CIP2B 184 


6829 


536 


1520 


2504 


3488 


787CIP2B_185 


6830 


537 


1521 


2505 


3489 


787CIP2B 186 


6835 


538 


1522 


2506 


3490 


787CIP2B 187 
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7016 
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659 


1643 


2627 


3611 
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660 
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8284 


661 
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8285 


662 
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8304 
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665 
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8331 


666 
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678 
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679 
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2647 
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680 
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2648 
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8444 


681 
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8446 


682 
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2650 
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683 
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8478 


684 
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8490 


685 


1669 
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8505 


686 
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8523 
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688 


1672 
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689 
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690 
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1675 


2659 
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8537 


692 
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693 
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2661 
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8546 


694 
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2662 
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8553 


695 
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2663 
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8556 


696 
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2664 
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697 
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2665 
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8562 


698 
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2666 


3650 
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8569 


699 
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2667 
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8587 


700 
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8597 


701 
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711 
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3663 


787CIP2B 364 


8643 


712 


1696 


2680 


3664 


787CIP2B 365 


8644 


713 


1697 


2681 


3665 


787CIP2B 366 


8645 






WO 01/57190



PCT/US01/04098



714 


1698 


2682 


3666 


787CIP2B 367 


8646 


715 


1699 


2683 
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8657 


716 


1700 
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8661 


717 
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8670 


718 
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8698 


720 
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722 
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3675 


787CIP2B 376 


8799 
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8806 


725 
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8809 


726 
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8814 


727 
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3679 
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728 


1712 
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730 
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8877 


731 
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2699 
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8886 


732 
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9003 
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9157 


734 
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9175 


735 
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9205 
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737 
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9307 


739 


1723 


2707 
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2711 
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9370 


744 


1728 
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9382 


745 
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746 


1730 
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2716 
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9715 


750 
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9755 


751 
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752 
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3704 


787CIP2B 405 


9771 
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754 
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755 


1739 
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757 


1741 
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758 
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10010 


759 


1743 
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760 
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10043 


761 
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763 
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767 
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886 


768 
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769 


1753 


2737 
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1916 


770 
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771 
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774 


1758 


2742 
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2887 
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3001 


776
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3182 


777


1761 
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lis 
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3182 


779 
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780 


1764 
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3196 


781 
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3733 


787CIP2C_15 


3224 


782 
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783 
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792 
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797 
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800 
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2768 


3752 
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3441 


801 
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2769 


3753 


787CIP2C 35 


3479 


802 
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803 
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804 
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805 
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3757 


787CIP2C_39 
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806 
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807 
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808 
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809 
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810 
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4218 


813 
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4219 


814 


1798 


2782 


3766 


787CIP2C 48 


4222 


815 


1799 


2783 


3767 


787CIP2C 49 


4222 


816 


1800 


2784 


3768 


787CIP2C_50 


4229 


817 


1801 


2785 


3769 


787CIP2C_51 


4230 


818 


1802 


2786 


3770 


787CIP2C_52 


4240 


819 


1803 


2787 


3771 


787CIP2C 53 


4241 


820 


1804 


2788 


3772 


787CIP2C_54 


4249 


821 


1805 


2789 


3773 


787CIP2C_55 


4252 


822 


1806 


2790 


3774 


787CIP2C_56 


4267 


823 


1807 


2791 


3775 


787CIP2C 57 


4272 


824 


1808 


2792 


3776 


787CIP2C_58 


4273 


825 


1809 


2793 


3777 


787CIP2C_59 


4275 


826 


1810 


2794 


3778 


787CIP2C_60 


4283 


827 


1811 


2795 


3779 


787CIP2C 61 


4290 


828 


1812 


2796 


3780 


787CIP2C_62 


4292 


829 


1813 


2797 


3781 


787CIP2C_63 


4305 


830 


1814 


2798 


3782 


787CIP2C 64 


4306 


831 


1815 


2799 


3783 


787CIP2C 65 


4308 


832 


1816 


2800 


3784 


787CIP2C 66 


4322 


833 


1817 


2801 


3785 


787CIP2C 67 


4351 


834 


1818 


2802 


3786 


787CIP2C 68 


4356 


835 


1819 


2803 


3787 


787CIP2C 69 


4399 


836 


1820 


2804 


3788 


787CIP2C 70 


4400 


837 


1821 


2805 


3789 


787CIP2C_71 


4520 


838 


1822 


2806 


3790 


787CIP2C 72 


4598 


839 


1823 


2807 


3791 


787CIP2C 73 


4599 


840 


1824 


2808 


3792 


787CIP2C_74 


4600 


841 


1825 


2809 


3793 


787CIP2C 75 


4670 


842 


1826 


2810 


3794 


787CIP2C 76 


4708 


843 


1827 


2811 


3795 


787CIP2C 77 


4734 


844 


1828 


2812 


3796 


787CIP2C_78 


4738 


845 


1829 


2813 


3797 


787CIP2C 79 


4749 


846 


1830 


2814 


3798 


787CIP2C_80 


4752 


847 


1831 


2815 


3799 


787CIP2C 81 


4752 


848 


1832 


2816 


3800 


787CIP2C 82 


4770 


849 


1833 


2817 


3801 


787CIP2C_83 


4784 


850 


1834 


2818 


3802 


787CIP2C 84 


4785 


851 


1835 


2819 


3803 


787CIP2C 85 


4792 


852 


1836 


2820 


3804 


787CIP2C 86 


4803 


853 


1837 


2821 


3805 


787CIP2C_87 


4811 


854 


1838 


2822 


3806 


787CIP2C 88 


4817 


855 


1839 


2823 


3807 


787CIP2C_89 


4818 


856 


1840 


2824 


3808 


787CIP2C 90 


4820 


857 


1841 


2825 


3809 


787CIP2C 91 


4831 


858 


1842 


2826 


3810 


787CIP2C_92 


4841 


859 


1843 


2827 


3811 


787CIP2C 93 


4869 


860 


1844 


2828 


3812 


787CIP2C 94 


4876 


861 


1845 


2829 


3813 


787CIP2C 95 


4902 


862 


1846 


2830 


3814 


787CIP2C 96 


4910 


863 


1847 


2831 


3815 


787CIP2C 97 


4931 


864 


1848 


2832 


3816 


787CIP2C_98 


5303 


865 


1849 


2833 


3817 


787CIP2C 99 


5317 


866 


1850 


2834 


3818 


787CIP2C 100 


5322 


867 


1851 


2835 


3819 


787CIP2C 101 


5330 


868 


1852 


2836 


3820 


787CIP2C 102 


5333 


869 


1853 


2837 


3821 


787CIP2C 103 


5333 


870 


1854 


2838 


3822 


787CIP2C 104 


5356 


871 


1855 


2839 


3823 


787CIP2C 105 


5363 


872 


1856 


2840 


3824 


787CIP2C 106 


5364 


873 


1857 


2841 


3825 


787CIP2C 107 


5379 


874 


1858 


2842 


3826 


787CIP2C 108 


5386 


875 


1859 


2843 


3827 


787CIP2C_109 


5397 


876 


1860 


2844 


3828 


787CIP2C 110 


5401 


877 


1861 


2845 


3829 


787CIP2C 111 


5419 


878 


1862 


2846 


3830 


787CIP2C_112 


5420 


879 


1863 


2847 


3831 


787CIP2C 113 


5452 


880 


1864 


2848 


3832 


787CIP2C 114 


5467 


881 


1865 


2849 


3833 


787CIP2C 115 


5482 


882 


1866 


2850 


3834 


787CIP2C 116 


5483 


883 


1867 


2851 


3835 


787CIP2C 117 


5492 


884 


1868 


2852 


3836 


787CIP2C 118 


5499 


885 


1869 


2853 


3837 


787CIP2C 119 


5525 


886 


1870 


2854 


3838 


787CIP2C 120 


5538 


887 


1871 


2855 


3839 


787CIP2C 121 


5539 


888 


1872 


2856 


3840 


787CIP2C 122 


5558 


889 


1873 


2857 


3841 


787CIP2C 123 


5559 


890 


1874 


2858 


3842 


787CIP2C_124 


5586 


891 


1875 


2859 


3843 


787CIP2C_125 


5619 


892 


1876 


2860 


3844 


787CIP2C_126 


5628 


893 


1877 


2861 


3845 


787CIP2C 127 


5640 


894 


1878 


2862 


3846 


787CIP2C_128 


5640 


895 


1879 


2863 


3847 


787CIP2C_129 


5827 


896 


1880 


2864 


3848 


787CIP2C_130 


6094 


897 


1881 


2865 


3849 


787CIP2C_131 


6195 


898 


1882 


2866 


3850 


787CIP2C_132 


6206 


899 


1883 


2867 


3851 


787CIP2C 133 


6355 


900 


1884 


2868 


3852 


787CIP2C 134 


6362 


901 


1885 


2869 


3853 


787CIP2C_135 


6386 


902 


1886 


2870 


3854 


787CIP2C 136 


6431 


903 


1887 


2871 


3855 


787CIP2C 137 


6457 


904 


1888 


2872 


3856 


787CIP2C 138 


6480 


905 


1889 


2873 


3857 


787CIP2C_139 


6497 


906 


1890 


2874 


3858 


787CIP2C 140 


6532 


907 


1891 


2875 


3859 


787CIP2C 141 


6598 


908 


1892 


2876 


3860 


787CIP2C 142 


6644 


909 


1893 


2877 


3861 


787CIP2C_143 


6644 


910 


1894 


2878 


3862 


787CIP2C_144 


6645 


911 


1895 


2879 


3863 


787CIP2C 145 


6645 


912 


1896 


2880 


3864 


787CIP2C 146 


6761 


913 


1897 


2881 


3865 


787CIP2C 147 


6782 


914 


1898 


2882 


3866 


787CIP2C 148 


6981 


915 


1899 


2883 


3867 


787CIP2C 149 


6981 


916 


1900 


2884 


3868 


787CIP2C_150 


7000 


917 


1901 


2885 


3869 


787CIP2C_151 


7029 


918 


1902 


2886 


3870 


787CIP2C_152 


7885 


919 


1903 


2887 


3871 


787CIP2C 153 


8143 


920 


1904 


2888 


3872 


787CIP2C 154 


8143 


921 


1905 


2889 


3873 


787CIP2C 155 


8234 


922 


1906 


2890 


3874 


787CIP2C_156 


8463 


923 


1907 


2891 


3875 


787CIP2C 157 


8467 


924 


1908 


2892 


3876 


787CIP2C_158 


8540 


925 


1909 


2893 


3877 


787CIP2C 159 


8600 


926 


1910 


2894 


3878 


787CIP2C_160 


9656 


927 


1911 


2895 


3879 


787CIP2C 161 


9669 


928 


1912 


2896 


3880 


787CIP2C 162 


9695 


929 


1913 


2897 


3881 


787CIP2C_163 


9744 


930 


1914 


2898 


3882 


787CIP2C_164 


9849 


931 


1915 


2899 


3883 


787CIP2D 1 


4180 


932 


1916 


2900 


3884 


787CIP2D 2 


4181 


933 


1917 


2901 


3885 


787CIP2D_3 


4314 


934 


1918 


2902 


3886 


787CIP2D 4 


4500 


935 


1919 


2903 


3887 


787CIP2D 5 


5651 


936 


1920 


2904 


3888 


787CIP2D 6 


5691 


937 


1921 


2905 


3889 


787CIP2D 7 


5881 


938 


1922 


2906 


3890 


787CIP2D 8 


5882 


939 


1923 


2907 


3891 


787CIP2D_9 


6209 


940 


1924 


2908 


3892 


787CIP2D_10 


6719 


941 


1925 


2909 


3893 


787CIP2D 11 


8130 


942 


1926 


2910 


3894 


787CIP2D_12 


8863 


943 


1927 


2911 


3895 


787CIP2D_13 


8902 


944 


1928 


2912 


3896 


787CIP2D 14 


9162 


945 


1929 


2913 


3897 


787CIP2D_15 


9197 


946 


1930 


2914 


3898 


787CIP2D_16 


9215 


947 


1931 


2915 


3899 


787CIP2D_17 


9232 


948 


1932 


2916 


3900 


787CIP2D_18 


9262 


949 


1933 


2917 


3901 


787CIP2D_19 


9369 


950 


1934 


2918 


3902 


787CIP2D_20 


9371 


951 


1935 


2919 


3903 


787CIP2D_21 


9516 


952 


1936 


2920 


3904 


787CIP2D_22 


9601 


953 


1937 


2921 


3905 


787CIP2D_23 


9731 


954 


1938 


2922 


3906 


787CIP2D_24 


9733 


955 


1939 


2923 


3907 


787CIP2D_25 


9769 


956 


1940 


2924 


3908 


787CIP2D_26 


9804 


957 


1941 


2925 


3909 


787CIP2D_27 


9816 


958 


1942 


2926 


3910 


787CIP2D_28 


9844 


959 


1943 


2927 


3911 


787CIP2D_29 


9924 


960 


1944 


2928 


3912 


787CIP2D 30 


9936 


961 


1945 


2929 


3913 


787CIP2D 31 


10163 


962 


1946 


2930 


3914 


787CIP2D_32 


10165 


963 


1947 


2931 


3915 


787CIP2D 33 


10165 


964 


1948 


2932 


3916 


787CIP2D 34 


10244 


965 


1949 


2933 


3917 


787CIP2D_35 


10278 


966 


1950 


2934 


3918 


787CIP2E 1 


4251 


967 


1951 


2935 


3919 


787CIP2E 2 


5310 


968 


1952 


2936 


3920 


787CIP2E 3 


5697 


969 


1953 


2937 


3921 


787CIP2E_4 


5731 


970 


1954 


2938 


3922 


787CIP2E_5 


5733 


971 


1955 


2939 


3923 


787CIP2E_6 


5734 


972 


1956 


2940 


3924 


787CIP2E 7 


5740 


973 


1957 


2941 


3925 


787CIP2E 8 


7657 


974 


1958 


2942 


3926 


787CIP2E 9 


9572 


975 


1959 


2943 


3927 


787CIP2F_1 


1363 


976 


1960 


2944 


3928 


787CIP2F_2 


4303 


977 


1961 


2945 


3929 


787CIP2F 3 


5760 


978 


1962 


2946 


3930 


787CIP2F_4 


5766 


979 


1963 


2947 


3931 


787CIP2F_5 


5767 


980 


1964 


2948 


3932 


787CIP2F_6 


5767 


981 


1965 


2949 


3933 


787CIP2F_7 


5770 


982 


1966 


2950 


3934 


787CIP2F 8 


6855 


983 


1967 


2951 


3935 


787CIP2F 9 


10026 


984 


1968 


2952 


3936 


787CIP2F 10 


10227 
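

The heading of Table 6 defines a small annotation alphabet for the listed peptide sequences: the twenty one-letter amino acid codes, X for an unknown residue, * for a stop codon, / for a possible nucleotide deletion, and \ for a possible nucleotide insertion. The Python sketch below is an illustration only and is not part of the disclosure. It tallies those symbols in one table entry and estimates how many whole codons lie between the predicted beginning and end nucleotide locations. The function names `summarize_peptide` and `codon_span` are hypothetical, and reading an entry whose beginning location exceeds its end location as lying on the reverse strand is an assumption, not something the table states.

```python
from collections import Counter

# The twenty standard one-letter amino acid codes listed in the table heading.
RESIDUES = set("ACDEFGHIKLMNPQRSTVWY")


def summarize_peptide(seq: str) -> dict:
    """Tally residues and annotation symbols in one peptide entry.

    Whitespace and line breaks are ignored. Any character outside the
    legend (for example OCR damage) is counted as 'unrecognized'.
    """
    counts = Counter()
    for ch in "".join(seq.split()):
        if ch in RESIDUES:
            counts["residues"] += 1
        elif ch == "X":
            counts["unknown"] += 1
        elif ch == "*":
            counts["stops"] += 1
        elif ch == "/":
            counts["deletions"] += 1
        elif ch == "\\":
            counts["insertions"] += 1
        else:
            counts["unrecognized"] += 1
    keys = ("residues", "unknown", "stops",
            "deletions", "insertions", "unrecognized")
    return {k: counts[k] for k in keys}


def codon_span(begin: int, end: int) -> tuple:
    """Return (strand, whole codons) for a pair of nucleotide locations.

    Assumption: begin > end is read as a reverse-strand entry.
    """
    strand = "forward" if begin <= end else "reverse"
    return strand, (abs(end - begin) + 1) // 3


if __name__ == "__main__":
    # A short fragment in the style of the table entries.
    print(summarize_peptide("GEP*GP\\KGIPG\\FH"))
    print(codon_span(3, 324))
    print(codon_span(703, 302))
```

A nonzero `unrecognized` count is a quick way to flag entries whose text was damaged in reproduction, since every legitimate character is covered by the legend.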



TABLE 6 



SEQ ID 
NO: 


Method 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first amino 
acid residue of 
peptide 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to last amino 
acid residue of 
peptide 
sequence 


Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid, 
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine, 
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine, 
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine, 
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine, 
X=Unknown, *=Stop codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 


2953 


A 


3 


324 


ISEHRIEASGNYLAQRLTSSFLRGLSSWKSNPLML 
CGWTILLTLTMVQGEP*GP\KGIPG\FHTNSSYPH 
WGTVAKPPAGD*DLLPAPGQEGTPLFTR*SLCTY 
CPID 


2954 


A 


18 


467 


REELGKDLFDCTLYVLLKYDDFNADKHLALEEF 

YRAFQVIQLSLPEDQKJLSITAATVGQSAVLSCAIQ 

GTLia>PnWKRNNIILNNLDLEDINDFGDDGSLYIT 

KVTTTHVGNYTCYADGYEQWQTfflFQVNVPPV 

IRVYPESQARRAG 


2955 


A 


3 


23 


FYSAFLVADKGIVTSKHNNDTQHIWESDSNEFSV 
IADPRGNTLGRGTnT*VSIPPSL 


2956 


A 


1 


493 


RTKTDVYILNLAVADLLLLFTLPFWAVNAVHGW 

VLGKIMCKITSALYTLNFVSGMQFLACISIDRYV 

AVTKVPSQSGVGKPCWIICFCVWMAAILLSIPQL 

WYTVNDNARCIPIFPRYLGTSMKALIQMLEICIG 

FVVPFLIMGVCYFITARTLMKMPNIKIS 


2957 


A 


703 


302 


EETGVREKRRERMKEKMWQNVLCCTLQTAVIL 
KLFQNKVLNILKNFFLSPLDTRKNKVFKKWAGG 
PGAVAHACNPSTLGGRGGRITKSGDRDHPGQHG 
ETRSLPACWAQWKSLALPVSRAPGRQGSLWEP 
LP 


2958 


A 


575 


1054 


CTKCKADCDTCFNKNFCTKCKSGFYLHLGKCLD 
NCPEGLEANNHTMECVSIVHCEVSEWNPWSPCT 
KKGKTCGFKRGTETRVREnQHPSAKGNLCPPTN 
ETRKCTVQIUKKCQKGERGKKGRERKRKKPNKG 
ESKEAIPDSKSLESSKEIPEQRENKQQQ 


2959 


A 


1 


426 


LSMLSTISTEHRLSVLWPIWYCCHCPTHLSAVMC 
VLLWALSLLQSE.EWMFCSFLFSDVDSDNWCQIL 
DFLTAVWLIFLI\LVLCGFTLVLLVRnCGSQKMPL 
TRLYVTILLTGLVFLFCSLPLSIQ*FLLYWIEKDLD 
DL 


2960 


A 


1194 


852 


EKRKTSYSQCLNSKQRNVSMRPSIWIHVHLKPPC 
RLVELLPFSSALQGLSHLSLGTTLPA'*GHLRFRL 
RNLPQSLRTVILPEKNEEQNLQELSHNADKYQM 
GDCCKEEBDDSIFY 


2961 


A 


274 


2250 


EKGKVKDAGAEQWISLSLSCKGSWETQFSNHLN 

SLTPPTSVRRMPLITTVTLLKMVARHHMKLLCSK 

AFSTQLQQKIFLHSQMGIHHQSVCMKLKPNTSHII 

SILMGQPMALVQLETLAPLTIIIQKFQTQDHMKF 

WKNLPLHSHHLTPSVPQTVIPKKTGSPEIKLKITK 

T1QNGRELFESSLCGDLLNEVQASE\Q*NQSIESRK 

EKRKKSNKHDSSRSEERKSHKIPKLEPEEQNRPN 

ERVDTVSEKPREEPVLKEGSPSSANTIFCSNNGSV 

HWKFQVGDLVWSKVGTYPWWPCMVSSDPQL 

EVHTKINTRGAREYHVQFFSNQPERAWVHEKRV 

REYKGHKQYEELLAEATKQASNHSEKQBCIRKPR 

PQRERAQWDIGIAHAEKALKMTREERIEQYTFIYI 

DKQPEEALSQAKKSVASKTEVKKTRRPRSVLNT 

QPEQTNAGEVASSLSSTEIRRHSQRRHTSAEEEEP 

PPVKIAWKTAAARKSLPASITMHKGSLDLQKCN 

MSPVVKffiQVFALQNATGDGKFIDQFVYSTKGIG 

NKTEISVRGQDRLHSTPNQRMEKPTQSVSSPEATS 

GSTGSVEKKQQRRSIRTRSESEKSTEVWKKKIK 

KEQVETVPQATVKTGLQKGSADRGVQGSVRFSD 

SSVSAAIEETVD 


2962 


A 


2408 


836 


SASPPPPPPPPPSRFPFSGAPGARDRSGPLGSEPQR 

NPGARPRTLEATVTPPGSVGAMSSSGLNSEKVA 

ALIQKLNSDPQFVLAQNVGTTHDLLDICLKRATV 

QRAQHVFQHAVPQEGKPITNQKSSGRCWffSCLN 

VMRLPFMKKLNIEEFEFSQSYLFFWDKVERCYFF 

LSAFVDTAQRKEPEDGRLVQFLLMNPANDGGQ 

WDMLVNIVEKYGVIPKKCFPESYTTEATRRMND 

ILNHKMREFCIRLKNLVHSGATKGEISATQDVM 

MEEIFRVVCICLGNPPETFTWEYRDKDKNNKKIG 

PMTPLEFNR/EQHVKPLFNMEDKICLVNDPRPQH 

KYNKLYTV\EYL\SNMVWRGEKLFYNNQPIDFLK 

KMVAASIKDG\EAVWFGCDVGKHF\NSKLGVLSD 

MNLYDHELVFGVSLKNMNKAER\LTFGES\LMT 

HTMTFTAV/SQSRDDSGMVLFTKW\RVGEFQWG 

EDHGH\KGYLCMTD*VGSLEYVYEVVA^RKH 

VP\EEVLAVLGAGNPFVLPAWDPMGALAE 


2963 


A 


90 


543 


RHYDSAGKin-KIAKNYLEQRAVGGASPRLAQS 

VLTCSREPILENSLTSLIEYLHNALEHDMRLRFNN 

DRMKTTIKETST*LSNSYLVFPLM*SLTYLMKMS 
FERCTAR>nCMFVNSPFTKVDNyCT\SS\WKKFYL 
KCYFSLNTIKKEKKMT 


2964 


A 


3 


2454 


FDTYRQLPSISNGNYSQLQFQAREYSGAPYSQRIS 

AITTVSVAWKVLSGKIGEGAEGNCKCVISEGAW 

AVCPTQPCGKAKPDKHLKDLLSKLLNSGYFESIP 

VPKNAKEKEVPLEEEMLIQSEKKTQLSKTESVKE 

SESLMEFAQPEIQPQEFLNRRYMTEVDYSNKQGE 

EQPWEADYAWCPNLPKRWDMLTEPDGQEKKQE 

SFKSWEASGKHQEVSKPAVSLEQRKQDTSKLRS 

TLPEEQKKQEISKSKPSPSQWKQDTPKSKAGYVQ 

EEHKKQETPKLWPVQLQKEQDPKKQTPKSWTPS 

MQSEQNTTKSWTTPMCEEQDSKQPETPKSWENN 

VESQKHSLTSQSQISPKSWGVATASLIPNDQLLPR 

KLNTEPKDVP/IACASA*GFLPLQPPFRRI/HVLRK 

EKLQDLMTQIQGTCNFMQESVLDFDKPSSAIPTS 

QPPSATPG*PRRHLKEQNLS\VKVIFFQGAVT\VF 

NVNAPLPPRKEQEIKESPYSPGYNQSFTTASTQTP 

PQCQLPSIHVEQTVHSQETANYHPDGTIQVSNGS 

LAFYPAQTNVFPRPTQPFVNSRGSVRGCTRGGRL 

ITNSYRSPGGYKGFDTYRGLPSISNGNYSQLQFQ 

AREYSGAPYSQRDNFQQCYKRGGTSGGPRANSR 

AGWSDSSQVSSPERDNETFNSGDSGQGDSRSMT 

PVDVPVTNPAATILPVHVYPLPQQMRVAFSAAR 

TSNLAPGTLDQPIVFDLLLNNLGETFDLQLGRFN 

CPVNGTYVFFHMLKLAVNVPLYVNLMKNEEVL 

VSAYANDGAPDHETASNHAILQLFQGDQIWLRL 

HRGAIYGSSW 


2965 


A 


3 


2454 


FDTYRGLPSISNGNYSQLQFQAREYSGAPYSQRIS 

AITTVSVAWKVLSGKIGEGAEGNCKCVISEGAW 

AVCPTQPCGKAKPDKHLKDLLSKLLNSGYFESIP 

VPKNAKEKEVPLEEEMLIQSEKKTQLSKTESVKE 

SESLMEFAQPEIQPQEFLNRRYMTEVDYS>fKQGE 

EQPWEADYARKPNLPKRWDMLTEPDGQEKKQE 

SFKSWEASGKHQEVSKPAVSLEQRKQDTSKJLRS 

TLPEEQKKQEISKSKPSPSQWKQDTPKSKAGYVQ 

EEHKKQETPKLWPVQLQKEQDPKKQTPKSWTPS 

MQSEQNTTKSWTTPMCEEQDSKQPETPKSWENN 

VESQKHSLTSQSQISPKSWGVATASLIPNDQLLPR 

KLNTEPKDVP/IACASA*GFLPLQPPFRRI/HVLRK 

EKLQDLMTQIQGTCNFMQESVLDFDKPSSAIPTS 

QPPSATPG*PRRHLKEQNLS\VKVIFFQGAVT\VF 

NVNAPLPPRKEQEIKESPYSPGYNQSFRRASTQTP 

PQCQLPSIHVEQTVHSQETANYHPDGTIQVSNGS 

LAFYPAQTNVFPRPTQPFVNSRGSVRGCTRGGRL 

ITNSYRSPGGYKGFDTYRGLPSISNGNYSQLQFQ 

AREYSGAPYSQRDNFQQCYKRGGTSGGPRANSR 

AGWSDSSQVSSPERDNETFNSGDSGQGDSRSMT 

PVDVPVTNPAATILPVHVYPLPQQMRVAFSAAR 

TSNLAPGTLDQPIVFDLLLNNLGETFDLQLGRFN 

CTVNGTYVFIFHMIiaAVNVPLYVhlLNlKNEEVL 

VSAYANDGAPDHETASNHAILQLFQGDQIWLRL 

HRGAIYGSSW 


2966 


A 


1693 


227 


DYVLTAELHRQRSPGVSFGLSVFNLMNAIMGSGI 

LGLAYVMANTGVFGFSFLLLTVALLASYSVHLL 

LSMCIQTAYLGP*TNYFMVLPAH*LTCLPLIEFLQ 
SL*NSL\*AVTSYEDLGLFAFGLPGKLVVAGTIIIQ 

NIGAMSSYLLIIKTELPAAIAEFLTGDYSRYWYLD 

GQTLLinCVGIVFPLALLPKIGFLGYTSSLSFFFM 

MFFALVVIIKKWSIPCPLTLNYVEKGFQISNVTDD 

CKPKLFHFSKESAYALPTMAFSFLCHTSELPIYCE 

LQSPSKKRMQNVTNTAIALSFLIYFISALFGYLTF 

YD/GTTKAQRGEVTCHRIKDKVESELLKG***IP* 

SHDVVVMT\VKLCILFAVLL\TVPLIHFPARKAVT 

MMFFSNFPFSWnUffLITLAL>niIVLLAIYVPDIRN 

VFGWGASTSTCLIFIFPGLFYLKLSREDFLSWKK 

LGVGCFC/LLSFKTSILRNSLSVYIILPASRKSIYFK 

I 


2967 


A 


3 


3222 


SGIWRALWREKKPGGGRRVKRRNPGRQAVGH 

TEEDPPRVGTPWKEHTGPGPQEGSTMEAAHAKT 

TEECLAYFGVSETTGLTPDQVKRNLEKYGLNELP 

AEEGKTLWELVIEQFEDLLVRILLLAACISFVLA 

WFEEGEETITAFVEPFVILLILIANAIVGVWQERN 

AENAIEALKEYEPEMGKVYRADRKSVQRIKARD 

IVPGDIVEVAVGDKVPADIRILAIKSTTLRVDQSIL 

TGEYVSVIKHTEPVPDPRAVNQDKKNMLFSGTNI 

AAGKALGIVATTGVGTEIGKIRDQMAATEQDKT 

PLQQKLDEFGEQLSKVISLICVAVWLINIGHFNDP 

VHGGSWFRGAIYYFKIAVALAVAAIPEGLPAVIT 

TCLALGTRRMAKKNAIVRSLPSVETLGCTSVICS 

DKTGTLTTNQMSVCKMFIIDKVDGDICLLNEFSIT 

GSTYAPEGEVLKNDKPVRPGQYDGLVELATICA 

LCNDSSLDFNEAKGVYEKVGEATETALTTLVEK 

MNVFNTDVRSLSKVERANACNSVIRQLMKKEFT 

LEFSRDRKSMSVYCSPAKSSRAAVGNKMFVKGA 

PEGVroRCNYVRVGTTRVPLTGPVKEKIMAVIKE 

WGTGRDTLRCLALATRDTPPKREEMVLDDSARF 

LEYETDLTFVGVVGMLDPPRKEVTGSIQLCRDA 

GIRVIMTGDNKGTAIAICRRIGIFGENEEVADRA 

Y\TGREFDDL\PLAEQ\REACRRACCFARVEPSHK 

SKIVEYLQSYDEITAMTGDGVNDAPALKKAEIGI 

AMGSGTAVAKTASEMVLADDNFSTIVAAVEEGR 

AIYNNMKQFIRYLISSNVGEVVCIFLTAALGLPEA 

LIPVQLLWVNLVTDGLPATALGFNPPDLDIMDRP 

PRSPKEPLASGWLFFRYMAIGGYVGAATVGAAA 

WWFLYAEDGPHVNYSQLTHFMQCTEDNTHFEGI 

DCEVFEAPEPMTMALSVLVTIEMCNALNSLSEN 

QSLLRjMPPWVMWLLGSICLSMSLHFLILYVDPLP 

MIFKLRALDLTQWLMVLKISLPVIGLDEILKFVA 

R3SIYLEG*LFPLLHL*ARVTDPEDERRK 


2968 


A 


3 


2414 


GARSCSRLGRCTFPLWKGREMEVRKLSISWQFLI 

VLVLILQILSALDFDPYRVLGVSRTASQADIKKA 

YKKLAREWHPDKNKDPGAEDKFIQISKAYEILSN 

EEKRSNYDQYGDAGENQGYQKQQQQREYRFRH 

FHENFYFDESFFHFPFNSERRDSIDEKYLLHFSHY 

VNEVAPDSFKKPYLIKITSDWCFSCIHIEPVWKEV 

IQELEELGVGIGVVHAGYERRLAHHLGAHSTPSI 

LGIINGKISFFHNAVVRENLRQFVESLLPGNLVEK 

VTNKNYVRFLSGWQQENKPHVLLFDQTPIVPLL 

YKLTAFAYKDYLSFGYVYVGLRGTEEMTRRYNI 

NIYAPTLLWKEHINRPADVIQARGNIKKQIIDDFI 
TRNKYLLAARLTSQKLFHELCPVKRSHRQRKYC 

VVLLTAETTKLSKPFEAFLSFALANTQDTVRFVH 

VYSNRQQEFADTLLPDSEAFQGKSAVSILERRNT 

AGRWYKTLEDPWIGSESDKFILLGYLDQLRKDP 

ALLSSEAVLPDLTDELAPVFLLRWFYSASDYISD 

CWDSIFHNNW\REMMPLLSLIFSALFILFGTVIVQ 

AFSDSNDERESSPPEKEEAQEKTGKTEPSFTKENS 

SKIPKKGFVEVTELTDVTYTSNLVRLRPGHMNV 

VLILSNSTKTSLLQKFALEVYTFTGSSCLHFSFLSL 

DKHREWLEYLLEFAQDAAPIPNQYDKHFMERDY 

TGYVLALNGHKKYFCLFKPQKTVEEGGKP*GSC 

SDVDSSLYLGESRGKPSCGLGSRPIKGKLSKLSL 

WMERLLEGSLQRFYIPSWPELD 


2969 


A 


48 


1117 


KGLSPDQVLSAFAPLDCEMWLKVFTTFLSFATG 

ACSGLKVTVPSHTVHGVRGQALYLPVHYGFHTP 

ASDIQIIWLFERPHTMPKYLLGSVNKSVVPDA^GI 

PA^TSSP*CHPMASLLINPLQFPDEGNYIVKVNIQG 

NGTLSASQKIQVTVDDPVTKPVVQIHPPSGAVEY 

VGNMTLTCHVEGGTRLAYQWLKNGRPVHTSST 

YSFSPQNNTLHIAPVTKEDIGNYSCLVRNPVSEM 

ESDIIMPIIYYGPYGLQVNSDKGLKVGEVFTVDL 

GEAILFDCSADSHPPNTYSW1RRTDNTTYIIKHGP 

RLEVASEKVAQKTMDYVCCAYNNITGRQDETHF 

TVIITSVGMCDIQGRDPNKT 


2970 


A 


68 


936 


HSALLTHSSFCVFTLCQDFFTYSSMSEEVTYADL 

QFQNSSEMEKIPEIGKFGEKAPPAPSHVWRPAAL 

FLTLLCLLLLIGLGVLASMFHVTLKIEMKKMNKL 

QNISEELQRNISLQLMSNMNISNKIRNLSTTLQTI 

ATKLCRELYSKEQEHKCKPCPRRWIWHKDSCYF 

LSDDVQTWQESKMACAAQNASLLKINNKNALE 

FIKSQSRSYDYWLGLSPEEDS/YSWYESG*YNQ\P 

SAWVIRNAPDLNNMYCGYINRLYVQYYHCTYK 

QRMICEKMANPVQLGSTYFREA 


2971 


A 


912 


2287 


VPNYLPSVSSAIGGEVPQRYVWRFCIGLHSAPRF 

LVAFAYWNHYLSCTSPCSCYRPLCRLNFGLNVV 

ENLALLVLTYVSSSEDF/TWVPG*GRSGEVFPEGT 

GLPLPHSDLPTSWCGHSLQCGSQSSFPPAIHENAF 

IVFL\SSLGHMLLTCILWRLTKKHTVSQE\DGLSL 

AGAPRQPRRKSRTSVLRIRVMVRWELSSNGNPG 

RGVLGLGLGLGNKLRWGQNLGL*HCVWVVWE 

TGE*KRWRLQMGIE*GVASRRQ*VRNSVRGLVC 

HNSSAPPMYMGFFSPTVFGGGVGG*LHVTFILHP 

PEVEAAGIPLLLGPSLPQRQGREHIVVILAAPACA 

PFHDR*WEPREIRPSP*ELGLRGEPTLSYPASCRVI 

RQPIP*DRKSYSWKQRLFIINFISFFSALAVYFRHN 

NTV^CEAGVYTIFAILEYTVVLTNMAFHMTAWWD 

FGNKELLITSQPEEKRF 


2972 


A 


1734 


246 


GGILSGRDGRTALPRPREPAERTAGLRRDMRPQE 

LPRLAFPLLLLLLLLLPPPPCPAHSATRFDPTWES 

LDARQLPAWFDQAKFGIFIHWGVFSVPSFGSEWF 

WWYWQKEKIPKYVEFMKDNYPPSFKYEDFGPL 

FTAKFFNANQ\WADIFQASGAKYIVLTSKHHEGF 

TLWG\SEYSWNWNAIDEGPKRDIVKELEVAIRNR 

TDLRFGLYYSLFEWFHPLFLEDESSSFHKRQFPVS 

KTLPELYELVNNYQPEVLWSDGDGGAPDQYWN 
STGFLAWLYNESPVRGTVVTNDRWGAGSICKHG 

GFYTCSDRYNPGHLLPHKWENCMTIDKLSWGY 

RREAGISDYLTIEELVKQLVETVSCGGNLLMNIG 

PTLDGTISVVFEERLRQMGSWLKVNGEAIYETHT 

WRSQNDTVTPDVWYTSKPKEKLVYAIFLKWPTS 

GQLFLGHPKAILGATEVKLLGHGQPLNWISLEQN 

GIMVELPQLTIHQMPCKWGWALALTNVI 


2973 


A 


24 


1133 


SVPRAGGDMETGAAELYDQALLGILQHVGNVQ 

DFLRVLFGFLYRKTDFYRLLRHPSDRMGFPPGAA 

QALVLQVFKTFDHMARQDDEKRRQELEEKIRRK 

EEEEAKTVSAAAAEKEPVPVPVQEIEIDSTTELDG 

HQEVEKVQPPGPVKEMAHGSQEAEAPGAVAGA 

AEVPR\EPPILPRIQEQFQKNPDSYNGAVRENYTW 

SQDYTDLEVRVPVPKHWKGKQVSVALSSSSIRV 

AMLEENGERVLMEGKLTHKINTESSLWSLEPGK 

CVLVNLSKVGEYWWNAILEGEEPIDIDKINKERS 

MATVDEEEQAVLDRLTFDYHQKLQGKPQSHEL 

KVHEMLKKGWDAEGSPFRGQRFDPAMFNISPGA 

VQF 


2974 


A 


271 


1854 


MQFGRAHGDCVSGAQLCGCPSMDDYMVLRMIG 

EGSFGRALLVQHESSNQMFAMKEIRLPKSFSNTQ 

NSRKEAVLLAKMKHPNIVAFKESFEAEGHLYIV 

MEYCDGGDLMQKIKQQKGKLFPEDMILNWFTQ 

MCLGVNfflHKKRVLHRDlKSKNIFLTQNGKGKL 

GDFGSARLLSNPMAFACTYVGTPYYVPPEIWEN 

LPY^mKSDIWSLGCILYELCTLKHPFQANSWKNL 

ILKVCQGCISPLPSHYSYELQFLVKQMFKRNPSH 

RPSATTLLSRGIVARLVQKCLPPEIIMEYGEEVLE 

EIKNSKHNTPRKKTNPSRIRIALGNEASTVQEEEQ 

DRKGSHTDLESINENLVESALRRVNREEKGNKSV 

HLRKASSPNLHRRQWEKNVPNTALTALENASILT 

SSLTAEDDRGGSVnCYSKNTTRKQWLKETPDTLL 

NILKNADLSLAFQTYTIYRPGSVEGFLKGPLSEETE 

ASDSVDGGHDSVILDPERLEPGLDEEDTDFEEED 

DNPDWVSELKKRAGWQGLCDR 


2975 


A 


32 


2833 


PPGEPGAGRGALSPCGPLSGPPPLPGREAGGTCG 

QPVNPVFDLSRRNPQEDFELIQRIGSGTYGDVYK 

ARNVNTGELAAIKVIKLEPGEDFAVVQQEIIMMK 

D\CKHP\DIVAYF\GSYL\RRDKLWI\CMEF\CGSGS 

\LQDIYHVTGPLSELQIAYVSRETLQGLYYLHSKG 

KMHRDIKGANILLTDNGHVKLADFGVSAQITATI 

AKRKSFIGTPYWMAPEVAAVERKGGYNQLCDL 

WAVGITAIELAELQPPMFDLHPMRALFLMTKSNF 

QPPKLKDKMKWSNSFHHFVKMALTKNPKKRPT 

AEKLLQHPFVTQHLTRSLAIELLDKVNNPDHSTY 

HDFDDDDPEPLVAVPHRIHSTSRNVREEKTRSEIT 

FGQVKFDPPLRKETEPHHELPDSDGFLDSSEEIYY 

TARSNLDLQLEYGQGHQG\GYFLGANKSLLKSV 

EEELHQRGHVAHLEDDEGDDDESKHSTLKAKIP 

PPLPPKPKSIFIPQEMHSTEDENQGTIKRCPMSGSP 

\AKPSQVPPRPPPPRLPPHKPVALGNGMSSFQLNG 

ERDGSLCQQQNEHRGENLSRKEKKDVPKPISNG 

LPPTPKVHMGACFSKVFNGCPLKIHCASSWINPD 

TRDQYLIFGAEEGIYTLNLNELHETSMEQLFPRR 

CTWLYVMNNCLLSISGKASQLYSHNLPGLFDYA 
RQMQKLPVAIPAHKLPDRILPRKFSVSAKIPETK 

WCQKCCWRNPYTGHKYLCGALQTSIVLLEWV 

EPMQKFMLKHIDFPIPCPLKMFEMLVVPEQEYP 

LVCVGVSRGRDFNQVVRFETVNPNSTSSWFTES 

DTPQTNVTHVTQLERDTILVCLDCCIKIVNLQGR 

LKSSRKLSSELTFDFRIESIVCLQDSVLAFWKHG 

MQGRSFRSNEVTQEISDSTRIFRLLGSDRVWLES 

RPTDNPTANSNLYILAGHENSY 


2976 


A 


32 


2833 


PPGEPGAGRGALSPCGPLSGPPPLPGREAGGTCG 

QPVNPVFDLSRRNPQEDFELIQRIGSGTYGDVYK 

ARNVNTGELAAIKVIKLEPGEDFAWQQEIIMMK 

D\CKHP\DIVAYF\GSYL\RRDKLWI\CMEF\CGSGS 

\LQDIYHVTGPLSELQIAYVSRETLQGLYYLHSKG 

KMHRDIKGANILLTDNGHVKLADFGVSAQITATI 

AKRKSFIGTPYWMAPEVAAVERKGGYNQLCDL 

WAVGITAIELAELQPPMFDLHPMRALFLMTKSNF 

QPPKLKDKMKWSNSFHHFVKMALTBQ4PKKRPT 

AEKLLQHPFVTQHLTRSLAIELLDKVNNPDHSTY 

HDFDDDDPEPLVAVPHRIHSTSRNVREEKTRSEIT 

FGQVKFDPPLRKETEPHHELPDSDGFLDSSEEIYY 

TARSNLDLQLEYGQGHQG\GYFLGANKSLLKSV 

EEELHQRGHVAHLEDDEGDDDESKHSTLKAKIP 

PPLPPKPKSIFIPQEMHSTEDENQGTIKRCPMSGSP 

\AKPSQVPPRPPPPRLPPHKPVALGNGMSSFQLNG 

ERDGSLCQQQNEHRGENLSRKEKKDVPKPISNG 

LPPTPKVHMGACFSKVFNGCPLKIHCASSWINPD 

TRDQYLIFGAEEGIYTLNLNELHETSMEQLFPRR 

CTWLYVMNNCLLSISGKASQLYSHNLPGLFDYA 

RQMQKLPVAIPAHKLPDRILPRKFSVSAKIPETK 

WCQKCCVVRNPYTGHKYLCGALQTSIVLLEWV 

EPMQKFMLIKHIDFPIPCPLKMFEMLVVPEQEYP 

LVCVGVSRGRDFNQVVRFETVNPNSTSSWFTES 

DTPQTNVTHVTQLERDTILVCLDCCIKIVNLQGR 

LKSSRKLSSELTFDFRIESrVCLQDSVLAFWKHG 

MQGRSFRSNEVTQEISDSTR[FRLLGSDRWVLES 

RPTDNPTANSNLYILAGHENSY 


2977 


A 


174 


1543 


YSLRKGITFKLAGAMVHIKKGELTQEEKELLEVI 

GKGTVQEAGTLLSSKNVRVNCLDENGMTPLMH 

AAYKGKLDMCKLLLRHGADVNCHQHEHGYTA 

LMFAALSGNKDITWVMLEAGAETDVVNSVGRT 

AAQMAAFVGQHDCVTIINNFFPRERLDYYTKPQ 

GLDKEPKLPPKLAGPLHKIITTTNLHPVKIVIVILV 

NENPLLTEEAALNKCYRVMDLICEKCMKQRDM 

NEVLAMKMHYISCIFQKCINFLKDGENKLDTLIK 

SLLKG\RASDGFPVYPEKILRESIRK\FPYCEATLL 

QQLVRSIAPVEIGSDPTAFSVLTQAITGQVGFVDV 

EFCTTCGEKGASKRCSVCKMVIYCDQTCQKTHW 

FTHKKICKNLKDIYEKQQLEAAKEKRQEENHGK 

LDVNSNCVNEEQPEAEVGISQKDSNPEDSGEGK 

KESLESEAELEGLQDAPAGPQVSEE 


2978 


A 


3 


5177 


SDDLRTGLFQDVQDAESLKLPGVYEVLFYNETE 
DCPGMMLWRYPEPRGLTLVRITPVPFNTTEDPDI 
STADLGDVLQDPCSLEYWDELQKVFVAFREFNL 
SESKVCELQLPDINLVNDQKKLVSSDLWRIVLNS 
SQNGADDQSSASESGSQSTCDPLVTPTALAACTR 
VDSCFTPWFVPSLCVSFQFAHLEFHLCHHLDQLG 

TAAPQYLQPFVSDRNMPSELEYMIVSFREPHMYL 

RQWNNGSVCQEIQFLAQADCBCLLECRNVTMQS 

WKPFSIFGQMAVSSDVVEKLLDCTVIVDSVFVN 

LGQHWHSLNTAIQAWQQNKCPEVEELVFSHFV 

ICNDTQETLRFGQVDTDENILLASLHSHQYSWRS 

HKSPQLLHICIEGWGNWRWSEPFSVDHAGTFIRT 

IQYRGRTASLIIKVQQLNGVQKQIIICGRQIICSYL 

SQSIELKWQHYIGQDGQAVVREHFDCLTAKQK 

LPSYILENNELTELCVKAKGDEDWSRDVCLESK 

APEYSIVIQVPSSNSSIIYVWCTVLTLEPNSQVQQ 

RMIVFSPLFIMRSHLPDPIIIHLEKRSLGLSETQIIP 

GKGQEKPLQNIEPDLVHHLTFQAREEYDPSDCA 

VPISTSLIKQIATKVHPGGTVNQILDEFYGPEKSL 

QPIWPYNKKDSDRNEQLSQWDSPMRVKLSIWKP 

YVRTLLIELLPWALLINESKWDLWLFEGEKIVLQ 

VPAGKJIIPPNFQEAFQIGIYWANTNTVHKSVAIK 

LVHNLTSPKWKDGGNGEWTLDEEAFVDTEIRL 

GAFPGHQKLCQFCISSMVQQGIQIIQIEDKTTIINN 

TPYQIFYKPQLSVCNPHSGKEYFRVPDSATFSICP 

GGEQPAMKSSSLPCWDLMPDISQSVLDASLLQK 

QIMLGFSPAPGADSSQCWSLPAIVRPEFPRQSVA 

VPLGNFRENGFCTRAIVLTYQEHLGVTYLTLSED 

PSPRVIIHNRCPVKMLIKENIKDIPKFEVYCBCKIPS 

ECSIHHELYHQISSYPDCKTKDLLPSLLLRVEPLD 

EVTTEWSDAIDINSQGTQVVFLTGFGYVYVDW 

HQCGTVFITVAPEGKAGPILTNTNRAPEKIVTF/K 

MFITQLSLAVFDDLTHHKASAELLRLTLDNIFLC 

VAPGAGPLPGEEPVAALFELYCVEICCGDLQLDN 

QLYNKSNFHFAVLVCQGEKAEPIQCSKMQSLLIS 

NKELEEYKEKCFIKLCITLNEGKSILCDINEFSFEL 

KPARLYVEDTFVYYIKTLFDTYLPNSRLAGHSTH 

LSGGKQVLPMQVTQHARALVNPVKLRKLVIQPV 

NLLVSIHASLKLYIASDHTPLSFSVFERGPIFITAR 

QLVHALAMHYAAGALFRAGWVVGSLDILGSPA 

SLVRSIGNGVADFFRLPYEGLTRGPGAFVSGVSR 

GTTSFVKHISKGTLTSITNLATSLARNMDRLSLDE 

EHYNRQEEWRRQLPESLGEGLRQGLSRLGISLLG 

AIAGIVDQPMQNFQKTSEAQASAGHKAKGVISG 

VGKGIMGVFTKPIGGAAELVSQTGYGILHGAGLS 

QLPKQRHQPSDWHADQAPNSHVKYVWKMLQS 

LGRPEVHMALDVVLVRGSGQEHEGCLLLTSEVL 

FVVSVSEDTQQQAFPVTEIDCAQDSKQNNLLTV 

QLKQPRVACDVEVDGVRERLSEQQYNRLVDYIT 

KTSCHLAPSCSSMQIPCPVVAAEPPPSTVKTYHY 

LVDPHFAQVFLSKFTMVKNKALRKGFP 


2979 


A 


255 


2673 


AWLFPASVLCPRCLTGSAVGSAEWKSLWLFPFS 

SRPTLGHLDSKPSSKSNMIRGRNSATSADEQPHIG 

NYRLLKTIGKGNFAKVKLARHILTGKEVAVKIID 

KTQLNSSSLQKLFREVRIMKVLNHPNIVKLFEVIE 

TEKTLYLVMEYASGGEVFDYLVAHGRMKEKEA 

RAKFRQIVSAVQYCHQKFIVHRDLKAENLLLDA 

DMNIKIADFGFSNEFTFGNKLDTFCGSPPYAAPEL 

FQGKKYDGPEVDVWSLGVILYTLVSGSLPFDGQ 

NLKELRERVLRGKYRIPFYMSTDCENLLKKFLIL 
NPSKRGTLEQIMKDRWMNVGHE\DDELKPYGEP 

LPVDYKDPRRTELMVSMGYTREEIQDSLVGQRYN 

EVMATYLLLGYKSSELEGDTITLKPRPSADLTNS 

SAPSPSHKVQRSVSANPKQRRFSDQAGPAIPTSNS 

YSKKTQSNNAENKRPEEDRESGRKASSTAKVPA 

SPLPGLERKKTTPTPSTNSVLSTSTNRSRNSPLL\E 

RASL\GQGFHPEWAKTALTMPGSRASTASASAA 

VSAARPRQHQKSMSASVHPNKASGLPPTESNCE 

VPRPRQVCWGSCTAPQRVPVASPSAHNISSSGGA 

PDRTNFPRGVSSRSTFHAGQLRQVRNDQQNLPYG 

VTPASPSGHSQGRRGASGSIFSKFTSKFVRRNLNE 

PESKDR\VETLRPHVV\NSGGNDKEKEEFREAKPR 

SLRFTWSMKTTSSMEP>mMMREIRKVLDANSCQ 

SELHEKYMLLCMHGTPGHEDFVQWEMEVCKLP 

RLSLNGVRFKRISGTSMAFKNIASKIANELKL 


2980 


A 


120 


3433 


NCLLLQAKGFHGEIEDLQQWLTDTERHLLASKP 

LGGLPETAKEQLNVHMEVCAAFEAKEETYKSLM 

QKGQQMLARCPKSAETNIDQDINNLKEKWESVE 

TKLTSnER\KT\KLEEALNLA\MEFHNSL\QDFINWLT 

QAEQTLNVASRPSLILDTVLFQIDEHKVFANEVN 

SHREQIIELDKTGTHLKYFSQKQDVVLIKNLLISV 

QSRWEKVVQRLVERGRSLDDARKRAKQFHEAW 

SKLMEWLEESEKSLDSELEIANDPDKIKTQLAQH 

KEFQKSLGAKHSVYDTTNRTGRSLKEKTSLADD 

NLKLDDMLSELRDKWDTICGKSVERQNKLEEAN 

LLFSGQFTDALQALIDWLYRVEPQLAEDQPVHG 

DIDLVMNLIDNHKAFQKELGKRTSSVQALKRSA 

RELIEGSRDDSSWVKVQMQELSTRWETVCALSIS 

KQTRLEAALRQAEEFHSVVHALLEWLAEAEQTL 

RFHGVLPDDEDALRTLIDQHKEFMKKLEEKRAE 

LNKATTMGDTVLAICHPDSITTIKHWITIIRARFEE 

VLAWAKQHQQRLASALAGLIAKQELLEALLAW 

LQWAETTLTDKDKEVIPQEIEEVXALIAEHQTFM 

EEMTRKQPDVDKVTKTYKRRAADPSSLQSHIPV 

LDKGRAGRKRFPASSLYPSGSQTQIETKNPRVNL 

LVSKWQQVWLLALERRRKLNDALDRLEELREF 

ANFDFDIWRKKYMRWMNHKKSRVMDFFRJRIDK 

DQDGKITRQEFIDGILSSKFPTSRLEMSAVADIFD 

RDGDGYIDYYEFVAALHPNKDAYKPITDADKIE 

DEVTRQVAKCKCAKRFQVEQIGDNKYRFFLGNQ 

FGDSQQLRLVRILRSTVMVRVGGGWMALDEFL 

VKNDPCRAKGRTNMELREKFILADGASQGMAA

FRPRGRRSRPSSRGASPNRSTSVSSQAAQAASPQ 

VPATTTPKILHPLTRNYGKPWLTNSKMSTPCKAA 

ECSDFPVPSAEGTPIQGSKLRLPGYLSGKGFHSGE 

DSGLITTAAARVRTQFADSKKTPSRPGSRAGSKA 

GSRASSRRGSDASDFDISEIQSVCSDVETVPQTHR 

PTPRAGSRPSTAKPSKIPTPQRKSPASKLDKSSKR 


2981 


A 


120 


3433 


NCLLLQAKGFHGEIEDLQQWLTDTERHLLASKP 

LGGLPETAKEQLNVHMEVCAAFEAKEETYKSLM 

QKGQQMLARCPKSAETNIDQDINNLKEKWESVE 

TKLNER\KTmEEALNLA\MEFHNSL\QDFINWLT 

QAEQTLNVASRPSLILDTVLFQIDEHKVFANEVN 

SHREQIIELDKTGTHLKYFSQKQDVVLIKNLLISV 

QSRWEKVVQRLVERGRSLDDARKRAKQFHEAW

SKLMEWLEESEKSLDSELEIANDPDKIKTQLAQH 

KEFQKSLGAKHSVYDTTNRTGRSLKEKTSLADD

NLKLDDMLSELRDKWDTICGKSVERQNKLEEA\ 

LLFSGQFTDALQALIDWLYRVEPQLAEDQPVHG 

DIDLVMNLIDNHKAFQKELGKRTSSVQALKRSA 

RELIEGSRDDSSWVKVQMQELSTRWETVCALSIS 

KQTRLEAALRQAEEFHSVVHALLEWLAEAEQTL

RFHGVLPDDEDALRTLIDQHKEFMKKLEEKRAE

LNKATTMGDTVLAICHPDSITTIKHWITIIRARFEE

VLAWAKQHQQRLASALAGLIAKQELLEALLAW 

LQWAETTLTDKDKEVIPQEIEEVKALIAEHQTFM 

EEMTRKQPDVDKVTKTYKRRAADPSSLQSHIPV 

LDKGRAGRKRFPASSLYPSGSQTQIETKNPRVNL 

LVSKWQQVWLLALERRRKLNDALDRLEELREF 

ANFDFDIWRKKYMRWMNHKKSRVMDFFRRIDK 

DQDGKITRQEFIDGILSSKFPTSRLEMSAVADIFD 

RDGDGYIDYYEFVAALHPNKDAYKPITDADKIE 

DEVTRQVAKCKCAKRFQVEQIGDNKYRFFLGNQ 

FGDSQQLRLVRILRSTVMVRVGGGWMALDEFL 

VKNDPCRAKGRTNMELREKFILADGASQGMAA 

FRPRGRRSRPSSRGASPNRSTSVSSQAAQAASPQ 

VPATTTPKILHPLTRNYGKPWLTNSKMSTPCKAA 

ECSDFPVPSAEGTPIQGSKLRLPGYLSGKGFHSGE 

DSGLITTAAARVRTQFADSKKTPSRPGSRAGSKA 

GSRASSRRGSDASDFDISEIQSVCSDVETVPQTHR 

PTPRAGSRPSTAKPSKIPTPQRKSPASKLDKSSKR 


2982 


A 


1 


2065 


MAAGGAEGGSGPGAAMGDCAEIKSQFRTREGF 

YKLLPGDGAARRSGPASAQTPVPPQPPQPPPGPA 

SASGPGAAGPASSPPPAGPGPGPALPAVRLSLVR 

LGEPDSAGAGEPPATPAGLGSGGDRVCFNLGRE 

LYFYPGCCRRGSQRWHTPLTPFLPPLKSIDLNKPI 

DKRIYKGTQPTCHDFNQFTAATETISLLVGFSAG 

QVQYLDLIKKDTSKLFNEERLIDKTKVTYLKWLP 

ESESLFLASHASGHLYLYNVSHPCASAPPQYSLL 

KQ\AWGFSFYAAKSKAPRNPLAKWAVGEGPLNE 

FAFSPDGRHLACVSQDGCLRVFHFDSMLLRGLM 

KSYFGGLLCVCWSPDGRYWTGGEDDLVTVWS 

FTEGRVVARGHGHKSWVNAVAFDPYTTRAEEA 

ATAAGADGERSGEEEEEEPEAAGTGSAGGAPLSP 

LPKAGSITYRPGSAGQDTQFCLWDLTEDVLYPHP 

PLARTRTLPGTPGTTPPAASSSRGGEPGPGPLPRS 

LSRSNSLPHPAGGGKAGGPGVAAEPGTPFSIGRF 

ATLTLQERRDRGAEKEHKRYHSLGNISRGGSGG 

SGSGGEKPSGPVPRSRLDPAKVLGTALCPRIHEV 

PLLEPLVCKKIAQERLTVLLFLEDCIITACQEGLIC 

TWARPGKAFTDEETEAQTGEGSWPRSPSKSVVE 

GISSQPGNSPSGTVV 


2983 


A 


3855 


220 


RRFRLSAHRAQPCCRCRGLEMPRGVFQQLSNLV 

LQELNANLSNLTSAFEKATAEKIKCQQEADATN 

RVILLANRLVGGLASENIRWAESVENFRSQGVTL 

CGDVLLISAFVSYVGYTTKKYRNELMEKFWIPYI 

HNLKVPIPITNGLDPLSLLTDDADVATWNNQGLP 

SDRMSTENATILGNTERWPLIVDAQLQGIKWIKN 

KYRSELKAIRLGQKSYLDVIEQATSEGDTLLIENI 

GETVDPALDPLLGRNTIKKGKYIKIGDKEVGVPP 

QVPPDPTHQVLQPTLQARDAGSVHNLINFLVTRD 

GLEDQLLAAVVAKERPDLEQLKANLTKSQNEFK 

IVLKELEDSLLARLSAASGNFLGDTALVENLETT 

KHTASEIEEKVVEAKITEVKINEARENYRPAAER 

ASLLYFILNDLNKINPVYQFSLKAFNVVFEKAIQR 

TTPANEVKQRVINLTDEITYSVYMYTARGLFERD 

mFLAQVTFQVLSMKKELNPVELDFLLRFPFKA 

GVVSPVDFLQHQGWGGIKALSEMDEFKNLDSDI 

EGSAKRWKKLVESEAPEKEIFPKEWKNKTALQK 

LCMVRCLRPDRMTYAJKNFVEEKMGSKFVEGRS 

VEFSKSYEESSPSTSIFFILSPGVDPLKDVEALGKK 

LGFTIDNGKLHNVSLGQGQEVVAENALDVAAEK 

GHWVILQNIHLVARWLGTLDKKLERYSTGRHED 

YRVFERAEPAPSPETHIIPQGILENAIKITNEPPTGM 

YANLYKALDLFTQDTLEMCTKEMEFKCMLFAL 

CYFHAVVAERRKFGAQGWNRSYPFNNGDLTISI 

NVLYNYLEANPKVPWDDLRYLFGEIMYGGHITD 

DWDRRLCRTYLAEYIRTEMLEGDVLLAPGFQIPP 

NLDYKGYHEYIDENLPPESPYLYGLHPNAEIGFL 

TVTSEKLFRTVLEMQPKETDSGAGTGVSREEKV 

KAVLDDILEKIPETFNMAEIMAKAAEKTPYVW 

AFQECERMNILTNEMRRSLKELNLGLKGELTITT 

DVEDLSTALFYDTVPDTWVARAYPSMMGLAAW 

YANLLLRIRELEAWTTDFALPTTVWLAGFFNPQS 

FLTAIMQSMARKNEWPLDKMCLSVEVTKKNRE 

DMTAPPREGSYVYGLFMEGARWDTQTGVIAEA 

RLKELTPAMPVIFIKAIPVARMETKNIYECPVYKT 

RIRGPTYVWTFNLKTKEKAAKWILAAVALLLQV 


2984 


A 


2 


1464 


FVLFPGIAMETPGASASSLLLPAASRPPRKREAGE 

AGAATSKQRVLDEEEYIEGLQTVIQRDFFPDVEK 

LQAQKEYLEAEENGDLERMRQIAIKFGSALGKM 

SREPPPPYVTPATFETPEVHAGTGVVGNKPRPRG 

RGLEDGEAGEEEEKEPLPSLDVFLSRYTSEDNAS 

FQEIMEVAKERSRARHAWLYQAEEEFEKRQKDN 

LELPSAEHQAIESSQASVETWKYKAKNSLMYYP 

EGVPDEEQLFKKPRQVVHKNTRFLRDPFSQALSR 

CQLQQAAALNAQHKQGKVGPDGKELIPQESPRV 

GGFGFVATPSPAPGVNESPMMTWGEVENTPLRV 

EGSETPYVDRTPGPAFKILEPGRRERLGLKMANE 

AAAKNRAKKQEALRRVTENLASLTPKGLSPAMS 

PALQRLVSRTASKYTDRALRASYTPSPARSTHLK 

NPGPVGCRPPQSTPGA/PGSATRTPL\TQDPA\SIT 

DNLLQLPARRKASDFF 


2985 


A 


1890 


178 


ASTQEAGLLSPPGVGAQRCWNFVACLPVRACAD 

MASNDYTQQATQSYGAYPTQPGQGYSQQSSQP 

YGQQSYSGYSQSTDTSGYGQSSYSSYGQSQNSY 

GTQSTPQGYGSTGGYGSSQSSQSSYGQQSSYPGY 

GQQPAPSSTSGSYGSSSQSSSYGQPQSGSYSQQPS 

YGGQQQSYGQQQSYNPPRGYGQQNQYNSSSGG 

GGGGGGGGSYGQDQSSMSGSGGGGGGGGGGGS 

GGGGGYGNQDQTGAAGSRGYRQ\QDRGGRCRG 

GSGGGGSXGGAAGYNRSSGGYEPRGRGGGRGGR 

GGMGGSDRGGFNKFGGPRDQGSRHDSEQDNSD 

NNTIFVQGLGENVTIESVADYFKQIGIIKTNKKTG 

QPMINLYTDRETGKLKGEATVSFDDPPSAKAAID 

WFDGKEFSGNPIKVSFATRRADFNRGGGNGRGG

RGRGGPMGRGGYGGGGSGGGGRGGFPSGGGGG 

GGQQRAGDWKCPNPTCENMNFSWRNECNQCK 

APKPDGPGGGPGGSHMGGNYGDDRRGGRGGYD 

RGGYRGRGGDRGGFRGGRGGGDRGGFGPGKM 

DSRGEHRQDRRERPY 


2986 


A 


1890 


178 


ASTQEAGLLSPPGVGAQRCWNFVACLPVRACAD 

MASNDYTQQATQSYGAYPTQPGQGYSQQSSQP 

YGQQSYSGYSQSTDTSGYGQSSYSSYGQSQNSY 

GTQSTPQGYGSTGGYGSSQSSQSSYGQQSSYPGY 

GQQPAPSSTSGSYGSSSQSSSYGQPQSGSYSQQPS 

YGGQQQSYGQQQSYNPPRGYGQQNQYNSSSGG 

GGGGGGGGSYGQDQSSMSGSGGGGGGGGGGGS 

GGGGGYGNQDQTGAAGSRGYRQ\QDRGGRCRG 

GSGGGGS\GGAAGYNRSSGGYEPRGRGGGRGGR 

GGMGGSDRGGFNKFGGPRDQGSRHDSEQDNSD 

NNTIFVQGLGENVTIESVADYFKQIGIIKTNKKTG 

QPMINLYTDRETGKLKGEATVSFDDPPSAKAAID 

WFDGKEFSGNPIKVSFATRRADFNRGGGNGRGG 

RGRGGPMGRGGYGGGGSGGGGRGGFPSGGGGG 

GGQQRAGDWKCPNPTCENMNFSWRNECNQCK

APKPDGPGGGPGGSHMGGNYGDDRRGGRGGYD 

RGGYRGRGGDRGGFRGGRGGGDRGGFGPGKM 

DSRGEHRQDRRERPY 


2987 


A 


1376 


898 


GGAKAGGAPHPFTLPFRHVGGLSAAPEEVEGML 

WAGARQHGRNWRKRETSPGTQGPLPPVPRA^PP 

GPDG\PHAIAPTLSWA1PRQQCSPQPGRLNALPPD 

RCSGPHFGDRAPESCFPGACSVSGACAFKGTRPA 

CPPQEPSLRSSRNRLREGQTFGRMEI 


2988 


A 


1 


1011 


MGNDSVSYEYGDYSDLSDRPVDCLDGACLATOP 

LRVAPLPLYAAIFLVGVPGNAMVAWVAGKVAR 

RRVGATWLLHLAVADLLCCLSLPILAVPIARGGH 

WPYGAVGCRALPSIILLTMYASVLLLAALSADLC 

FLALGPAW\CLRFS/GACGVQVACGAAWTLALL 

LTVPSAIYRRLHQEHFPARLQCVVDYGGSSSTEN 

AVTAIRFLFGFLGPLVAVASCHSALLCWAARRC 

RPLGTAIWGFFVCWAPYHLLGLVLTVAAPNSA 

LLARALRAEPLIVGLALAHSCLNPMLFLYFGRAQ 

LRRSLPAACHWALRESQGQDESVDSKKSTSHDL 

VSEMEV 


2989 


A 


27 


4074 


KSQLFCFWVGKAGDILSGDQDKEQKDPYFVETP 

YGYQLDLDFLKYVDDIQKGNHKRLNIQKRRKPS 

VPCPEPRTTSGQQGIWTSTESLSSSNSDDNKQCP 

NFLIARSQVTSTPISKPPPPLETSLPFLTIPENRQLP 

PPSPQLPKHNLHVTKTLMETRRRLEQERATMQM 

TPGEFRRPRLASFGGMGTTSSLPSFVGSGNHNPA 

KHQLQNGYQGNGDYGSYAPAAPTTSSMGSSIRH 

SPLSSGISTPVTNVSPMHLQHIREQMAIALKRLKE 

LEEQVRTIPVLQVKISVLQEEKRQLVSQLKNQRA 

ASQINVCGVRKRSYSAGNASQLEQLSRARRSGG 

ELYIDYEEEEMETVEQSTQRIKEFRQL\TADMQA 

LEQKIQDSSCEASSELRENGECRSVAVGAEENMN 

DIWYHRGSRSCKDAAVGTLVEMRNCGVSVTEA 

MLGVMTEADKEIELQQQTIESLKEKIYRLEVQLR 

ETTffl^REMTKLKQELQAAGSRKKVDKATMAQP 

LVFSKVVEAVVQTRDQMVGSHMDLVDTCVGTS 

VETNSVGISCQPECKNKWGPELPMNWWIVKER 

VEiS4HDRCAGRSVEMCDKSVSVEVSVCETGSNTE 

ESVNDLTLLKTNLNLKEVRSIGCGDCSVDVTVCS 

PKECASRGVNTEAVSQVEAAVMAVPRTADQDT 

STDLEQVHQFTNTETATLIESCTNTCLSTLDKQTS 

TQTVETRTVAVGEGRVKDINSSTKTRSIGVGTLL 

SGHSGFDRPSAVKTKESGVGQININDNYLVGLK 

MRTIACGPPQLTVGLTASRRSVGVGDDPVGESLE 

NPQPQAPLGMMTGLDHYIERIQKLLAEQQTLLA 

ENYSELAEAFGEPHSQMGSLNSQLISTLSSINSVM 

KSASTEELRNPDFQKTSLGKITGSYLGYTCKCGG 

LQSGSPLSSQTSQPEQEVGTSEGKPISSLDAFPTQ 

EGTLSPVNLTDDQIAAGLYACTNNESTLKSIMKK 

KDGNKDSNGAKKNLQFVGINGGYETTSSDDSSS 

DESSSSESDDECDVIEYPLEEEEEEEDEDTRGMAE 

GHHAVNIEGLKSARVEDEMQVQECEPEKVEIRE 

RYELSEKMLSACNLLKNTINDPKALTSKDMRFC 

LNTLQHEWFRVSSQKSAIPAMVGDYIAAFEAISP 

DYLRYVINLADGNGNTALHYSVSHSNFEIVKLLL 

DADVCNVDHQNKAGYTPIMLAALAAVEAEKDM 

RIVEELFGCGDVNAKASQAGQTALMLAVSHGRl 

DMVKGLLACGADVNIQDDEGSTALMCASEHGH 

VEIVKLLLAQPGCNGHLEDNDGSTALSIALEAGH 

KDIAVLLYAHVNFAKAQSPGTPRLGRKTSPGPTH 

RGSFD 


2990 


A 


69 


1687 


ERLRPGQRAIRGPVPAAGACASLPPRAGPAQGRH 
AALGGAEPGSHLHCGVRLQRREEPGGQQRLLPQ 
RGGSAQTGHQHPGPYECQCPGPQPGGTTPALLSL 
ELEETRGPPASANPDKDHSTQPGTMGRKKIQISRI 
LDQR^^lQVTFTKJlKTGLMKKAYELSVLCDCEIA 
LIIFNSATRLFQYASTDMDRVLLKYTEYSEPHESR 
TNTDILETLKRRGIGLDGPELEPDEGPEEPGEKFR 
RLAGEGGDPALPRPRLYPAAPAMPSPDVVYGAL
PPPG\CDPSGLGEALPAQSRPSPFRPAAPKAGPPG 
LGHPLFSPSHLTSKTPPPLYLPTEGRRSDLPGGLA 
GPRGGLNTSRSLYSGLQNPCSTATPGPPLGSFPFL 
PGGPPVGAEAWARRVPQPAAPPRRPPQSSIKSER 
LFLRPPGAPATFLRPSPIPCSSPGPWQSLCGLGPP\ 
CAGCPWPTAGPGRRSPGGTSPERSPGTARARGDP 
\TSLQAFSEKTHTVTAPLRGGGLEVGGWTQSSAG 
GLLSFFLFVCISTNKNARGVRGPEKK 


2991 


A 


3 


1159 


IPQPLHCASPKEEMSLRCGDAARTLGPRVFGRYF 

CSPVRPLSSLPDKKKELLQNGPDLQDFVSGDLAD 

RSTWDEYKGNLKRQKGERLRLPPWLKTEIPMGK 

NYNKLKNTLRNLNLHTVCEEARCPNIGECWGGG 

EYATATATIMLMGDTCTRGCRFCSVKTARNPPP 

LDASEPYNTAKAIAEWGLDYWLTSVDRDDMP 

DGGAEHIAKTVSYLKERNPKILVECLTPDFRGDL 

KAIEKVALSGLDVYAHNVETVPELQSKVRDPRA 

NFDQSLRVLKHAKKVQPDVISKTSIMLGLGENDE 

QVYATMKALREADVDCLTLGQYMQPTRRHLKV 

EEYITPEKFKYWEKVGNELGFHYTASGPVLVRSS 

YKAGEFFLKNLVAKRKTKDL 


2992 


A 


3 


1636 


PVPGVPTSPPSCCPQDMQGPWVLLLLGLRLQLSL

GVIPAEEENPAFWNRQAAEALDAAKKLQPIQKV 

AKNLILFLGDGLGVPTVTATRILKGQKNGKLGPE 

TPLAMDRFPYLALSKTYNVDRQVPDSAATATAY 

LCGVKANFQTIGLSAAARFNQCNTTRGNEVISV 

MNRAKQAGKSVGVVTTTRVQHASPAGTYAHTV 

NRNWYSDADMPASARQEGCQDIATQLISNMDID 

VILGGGRKYMFPMGTPDPEYPADASQNGIRLDG 

KNLVQEWLAKHQGAWYVWNRTBLMQASLDQS 

VTHLMGLFEPGDTKYEIHRDPTLDPSLMEMTEA 

ALRLLSRNPRGFYLFVEGGRIDHGHHEGVAYQA 

LTEAVMFDDAIERAGQLTSEEDTLTLVTADHSH 

VFSFGGYTLRGSSIFGLAPSKAQDSKAYTSILYGN 

GPGYVFNSGVRPDVNESESGSPDYHQQAGWPLS 

SETHGGEDVAVFARGPQAHLVHGVQEQSFVAH 

VMAFAACLEPYTACDLAPPACTTDAAHPVAASL 

PLLAGTLLLLGASAAP 


2993 


A 


3 


685 


DAWARLLKMNRLFGKAKPKAPPPSLTDCIGTVD 

SRAESIDKKISRLDAELVKYKDQIKKMREGPAKN 

MVKQKALRVLKQKRMYEQQRDNLA\NSHSTW\ 

TS\HYTIQSLKDTKTWDAMKLGVKEMKKAYKQ 

VKIDQIEDLQDQLEDMMEDANEIQEALSRSYGTP 

ELDEDDLEAELDALGDELLADEDSSYLDEAASA 

PAIPEGVPTDTKNKDGVLVDEFGLPQIPAS 


2994 


A 


1710 


161 


RRCELTPFIIKTLILPKSWGAFPEDWMQHVSSSQ 

SSQRHVQWPGACPGAGEEQPACSQPSLPLTLPSP 

SHQLQQLMVRGGPAGGQNMNVDLQGVGPGLQ 

GSPQVTLAPLPLPSPTSPGFQFSAQPRRFEHGSPS 

YIQVTSPLSQQVQTQSPTQPSPGPGQALQNVRAG 

APGPGLGLCSSSPTGDFVDASVLVRQISLSPSSGG 

HFVFQDGSGLTQIAQGAQVQLQHPGTPITVRERR 

PSQPHTQSGGTIHHLGPQSPAAAGGAGLQPLASP 

SHITTANLPPQISSIIQGQLVQQQQVLQGPPLPRPL 

GFERTPGVLLPGAGGAAGFGMTSPPPPTSPSRTA 

VPPGLSSLPLTSVGNTGMKKVPKKLEEIPPASPE 

MAQMRKQCLDYHHQEMQALKEVFKEYLIELFF 

LQHFQGNMMDFLAFKERLYGPLQAYLRQNDLDI 

EEEEEE\HFEVINDEVKVVARKHGQPGTPVAIAT\ 

QLPPRTSAAFPAQQQPLQVLSDGSTVQLPRLSSL 

GFEDSMC 


2995 


A 


3 


924 


SAPSGIDASTHAFARCKHPINVRRDPSIPIYGLRQS 

ILLNTRLQDCYVDSPALTNIWMARTCAKQNINAP 

APATTSSWEVVRNPLIASSFSLVKLVLRRQLKNK 

CCPPPCKFGEGKLSKRLKHKDDSVMKATQQARK 

RNFISSKSKQPAGHRRPAGGIRESKESSKEKKLTV 

RQDLEDRYAEHVAA'RQALPQDSGTAAWKGXRV 

LLPETQKRQQLSEDTLTIHGLPTEGYQALYHAVV 

EPMLWNPSGTPKRYSLELGKAIKQKLWEALCSQ 

GAISEGAQRDRFPGRKQPGVHEEPVLKKWPKLK 

SKK 


2996 


A 


3 


1713 


GKFGIKPSQRRISGKSTFHSEMEGEDTRDDSLYSI 

LEELWQDAEQIKRCQEKHNKLLSRTTFLNKKJLN 

TEWDYEYKDFGKFVHPSPNLILSQKRPHKRDSFG 

KSFKHNLDLHIHNKSNAAKNLDKTIGHGQVFTQ 

NSSYSHHENTHTGVKFCERNQCGKVLSLKHSLS 

QNVKFPIGEKANTCTEFGKIFTQRSHFFAPQKIHT 

VEKPHELSKCVNVFTQKPLLSIYLRVHRDEKLYA 

CTKM/CGKGLHPRNSELIMHEKTHTREKPYKCNE 

\CGKSFFQVSSLLRHQTTHTGEKLFECSECGKGFS 

LNSALNIHQKIHTGERHHKCSECGKAFTQKSTLR 

MHQRIHTGERSYICTQCGQAFIQKAHLIAHQRIH 

TGEKPYECSDCGKSFPSKSQLQMHKRIHTGEKPY 

ICTECGKAFTNRSNLNTHQKSHTGEKSYICAECG 

KAFTDRSNFNKHQTIHTGEKPYVCADCGRAFIQK 

SELITHQRIHTTEKPYKCPDCEKSFSKKPHLKVHQ 

RfflTGEKPYICAECGKAFTDRSNFNKHQTIHTGD 

KPYKCSDCGKGFTQKSVLSMHRNIHT 


2997 


A 


3 


1763 


AASTRTMGSRHFEGIYDHVGHFGRFQRVLYFICA 

FQNISCGIHYLASVFMGVTPHHVCRPPGNVSQVV 

FHNH'SNWSLEDTGALLSSGQKDYVTVQLQNGEI 

WELSRCSRNKRENTSSLGYEYTGSKKEFPCVDG 

YIYDQNTWKSTAVTQWNLVCDRKWLAMLIQPL 

FMFGGPTGIGA^TFGYF^SDRLGRRVVLWATSSS 

MFLFGIAAAFAVDYYTFMAARFFLAMVASGYLV 

VGFVYVMEFIGMKSRTWASVHLHSFFAVGTLLV 

ALTGYLVRTWWLYQMILSTVTVPFILCCWVLPE 

TPFWLLSEGRYEEAQKMVDIMAKWNRASSCKLS 

ELLSLDLQGPVSNSPTEVQKHNLSYLFYNWSITK 

RTLTVWLIWFTGSLGFYSFSLNSVNLGGNEYLNL 

FLLGVVEIPAYTFVC1AMDKVGRRTVLAYSLFC\S 

ALACGVVMVIPQKHYILGVVTAMWGKILPIGAA 

FG\LIYLYTAELYPTIVRSLAVGSGSMVCRLASIL 

APFSVDLSSIWIFIPQLFVGTMALLSGVLTLKLPE 

TLGKRLATTWEEAAKLESENESKSSKLLLTTNNS 

GLEKTEAITPRDSGLGE 


2998 


A 


3 


1441 


QRPASQLLAPFAAEALPGAPRAAMAQHFSLAAC 

DVVGFDLDHTLCRYNLPESAPLIYNSFAQFLVKE 

KGYDKELLNVTPEDWDFCCKGLALDLEDGNFL 

KLANNGTVLRASHGTKMMTPEVLAEAYGKKEW 

KHFLSDTGMACRSGKYYFYDNYFDLPGALLCAR 

VVDYLTKLNNGQKTFDFWKDIVAAIQHNYKMS 

AFKENCGIYFPEIKRDPGRYLHSRPESVKXWLRQ 

LKNAGKILLLITSSHSDYCRLLCA\YILGNDFTDLF 

DIVITNALKPGFFSHLPSQRPFRTLENDEEQEALP 

SLDKPGWYSQGNAVHLYELLKKMTGKPEPKW 

YFGDSMHSDIFPARHYSNWETVLILEELRGDEGT 

RSQRPEESEPLEKKGKYEGPKAKPLNTSSKKWGS 

FFMDSVLGLENTEDSLVYTWSCKRISTYSTIAIPSI 

EAIAELPLDYKFTRFSSSNSKTAGYYPNPPLVLSS 

DETLISK 


2999 


A 


320 


2417 


LRRRKMTPQSLLQTTLFLLSLLFLVQGAHGRGHR 

EDFRFCSQRNQTHRSSLHYKPTPDLRISIENSEEA 

LTVHAPFPAAHPASRSFPDPRGLYHFCLYWNRH 

AGRLHLLYGKRDFLLSDKASSLLCFQHQEESLAQ 

GPPLLATSVTSWWSPQNISLPSAASFTFSFHSPPH 

TGAHNASVDMCELKRDLQLLSQFLKHPQKASRR 

PSAAPASQQLQSLESKLTSVRFMGDMGSFEEDRI 

NATVWKLQPTAGLQDLHIHSRQEEEQSEIMEYS 

VLLPRTLFQRTKGRSGEAEKRLLLVDFSSQALFQ 

DKNSSQVLGEKVLGIVVQNTKVANLTEPVVLTF 

QHQLQPKNVTLQCVFWVEDPTLSSPGHWSSAGC 

ETVRRETQTSCFCNHLTYFAVLMVSSVEVDAVH 

KHYLSLLSYVGCWSALACLVTIAAYLCSRVPLP 

CRRKPRDYTIKVHMNLLLAVFLLDTSFLLSEPVA 

LTGSEAGCRASAIFLHFSLLTCLSWMGLEGYNLY 

RLWEVFGTYVPGYLLKLSAMGWGFPIFLVTLV 

ALVDVDNYGPIILAVHRTPEGVIYPSMCWIRDSL 

VSYIT>a.GLFSLVFLFNMAMLATMVVQILRLRPH 

TQKWSHVLTLLCLSLVLG\LPWALIFFSFASGTFQ 

LWLYLFSIITSFQGFLIFIWYWSMRLQARGGPSP 

LKSNSDSARLPISSGSTSSSRI 


3000 


A 


66 


1003 


SRGQLDAGQSSEQHGGNRQPEQSRSRSSSSSSSP 

RRSRSAAEPAMALSMPLNGLKEEDKEPLIELFVK 

AGSDGESIGNCPFSQRLFMILWLKGVVFSVTTVD 

LKKKPADLQNLAPGTHPPFITFNSEVKTDVNKIEE 

FLEEVLCPPKYLKLSPKHPESNTAGMDIFAKFSA 

YIKNSRPEANEALERGLLKTLQKLDEYLNSPLPD 

EIDENSMEDIKFSTRKFLDGNEMTLADCNLLPKL 

HIVKWAKKYRNFDIPKEMTGIWRYLTNAYSRD 

EFTNTCPSDKEVEMYSDVAKRLHQVKSRLLKE 

VSFMSSP 


3001 


A 


779 


2006 


LALTFRSALSTLPGSPMTSSGSPDLQLAWGPSLLP 

HPPSVWSPALPSCFAGPCPLLPLSDTQGWWGPN 

WLAPPSAALCRPDAAVWPDLPSSNILLVTPPPAK 

*SAVAV*PCPRGAHSLERAARQYT1SGSSTSQSGK 

CSKRDTKCCAVTTSWGCFWQKHWKGDEDSGW 

AFQEGSHLGEGHL 


3002 


A 


909 


2799 


VEEAWTVWLHWGVRECLLEEETNQKEEAASSN 

WTKARGPFWQEDWVWDMRLKMTTRNFPEREV 

PCDVEVERFTREVPCLSSLGDGWDCENQEGHLR 

QSALTLEKPGTQEAICEYPGFGEHLIASSDLPPSQ 

RVLATNGFHAPDSNVSGLDCDPALPSYPKSYAD 

KRTGDSDACGKGFNHSMEVIHGRNPVREKPYKY 

PESVKSFNHFTSLGHQKIMKRGKKSYEGKNFENI 

FTLSSSLNENQRNLPGEKQYRCTECGKCFKRNSS 

LVLHHRTHTGEKPYTCNECGKSFSKNYNLIVHQ 

RIHTGEKPYECSKCGKAFSDGSALTQHQRIHTGE 

KPYECLECGKTFNRNSSLELHQRTHTGEKPYRCN 

ECGKPFTDISHLTVHLRIHTGEKPYECSKCGKAF 

RDGSYLTQHERTHTGEKPFECAECGKSFNRNSHL 

IVHQKIHSGEKPYECKECGKTFIESAYLIRHQRIH 

TGEKPYGCNQCQKLFRNIAGLIRHQRTHTGEKPY 

ECNQCGKAFRDSSCLTKHQRIHTKETPYQCPECG 

KSFKQNSHLAVHQRLHSREGPSRCPQCGKMFQK 

SSSLVRHQRAHLGEQPMET*WLGAT*VFQFTLTP 

VFRRRVLDLTPLWSVEKNPLSYPVN 


3003 


A 


2 


1489 


SLTEHLSFFQPTAHSLTSLLGTMTTCSRQFTSSSS 

MKGSCGIGGGIGGGSSRJSSVLAGGSCRAPSTYG 

GGLSVSSRFSSGGACGLGGGYGGGFSSSSSFGSG 

FGGGYGGGLGAGFGGGLGAGFGGGFAGGDGLL 

VGSEKVTMQNLNDRLASYLDKVRALEEANADL 

EVKIRDWYQRQRPSEIKDYSPYFKTIEDLRlSrKIIA 

ATIENAQPILQIDNARLAADDFRTKYEHELALRQ 

TVEADVNGLRRVLDELTLARTDLEMQIEGLKEE 

LAYLRKNH*EEMLALRGQTGGEVNVETDAAPG 

\nDLSCILNEMRNQYEQMAEKNRRDAETWFLSKT 

EELNKEVASNSELVQSSRSEVTELRRVLQGLEIEL 
QSQLSMKASLENSLEETKGRYCMQLSQIQGLIGS 
VEEQLAQLRCEMEQQSQEYQILLDVKTRLEQEIA 
TYRRLLEGEDAHLSSQQASGQSYSSREVFTSSSSS 
SSRQTRPILKEQSSSSFSQGQSS 


3004 


A 


2 


940 


GCAPDTRFFVPEPGGRGAAPWVALVARGGCTFK 

DKVLVAARRNASAVVLYNEERYGNITLPMSHAG 

TGNIVVIMSYPKGREILELVQKGIPVTMTIGVGT 

RHVQEFISGQSVVFVAIAFITMMnSLAWLIFYYIQ 

RFLYTGSQIGSQSHRKETKKVIGQLLLHTVKHGE 

KGroVDAENCAVCIENFKVmiRILPCKHIFHRIC 

IDPWLLDHRTCPMCKLDVIKALGYWGEPGDVQE 

MPAPESPPGRDPAANLSLALPDDDGSDESSPPSA 

SPAESEPQCDPSFKGDAGENTALLEAGRSDSRHG 

GPIS 


3005 


A 


184 


2552 


TMTfflQFLLLFLFWVCLPHFCSPEIMFRRTPVPQQ 

RILSSRVPRSDGKILHRQKRGWMWNQFFLLEEY 

TGSDYQYVGKLHSDQDKGDGSLKYILSGDGAGT 

LFIIDEKTGDIHATRRIDREEKAFYTLRAQAINRR 

TLRPVEPESEFVIKIHDINDNEPTFPEEIYTASVPE 

MSVVGTSVVQVTATDADDPSYGNSARVIYSILQ 

GQPYFSVEPETGIIRTALPNMNRENREQYQVVIQ 

AKDMGGQMGGLSGTTTVNITLTDVNDNPPRFPQ 

NTIHLRVLESSPVGTAIGSVKATDADTGKNAEVE 

YRIIDGDGTDMFDIVTEKDTQEGIITVKKPLDYES 

RRLYTLKVEAENTHVDPRFYYLGPFKDTTIVKISI 

EDVDEPPVFSRSSYLFEVHEDIEVGTIIGTVMARD 

PDSISSPIRFSLDRHTDLDRIFNIHSGNGSLYTSKP 

LDRELSQWHl^TVIAAEINOTKETTRVAVFVRIL 

DANDNAPQFAVFYDTFVCENARPGQLIQTISAVD 

KDDPLGGQKFFFSLAAVNPNFTVQDNEDNTARIL 

TRKNGFNRHEISTYLLPVVISDNDYPIQSSTGTLTI 

RVCACDSQGNMQSCSAEALLLPAGLSTGALIAIL 

LCmLLVIVVLFAALKRQRKKEPLILSKEDIRDNr/ 

SYNDEGGGEEDTQAFDIGTLRNPAAffiEICKLRJRD 

IBPETLFIPRRTPTAPDNTDVRDFINERLKEHDLDP 

TAPPYDSLATYAYEGNDSIAESLSSLESGTTEGD 

QNYDYLREWGPRFNKLPQKYGGGESDKDS 


3006 


A 


2 


541 


GRVDKTWWGKSVGIMLTELEKALNSIIDVYHKY 

SLKGNFHAVYRDDLKKLLETECPQYIRKKGAD 

VWFKELDINTDGAVNFQEFLILVIKMGVAALNSII 

DVYHKYSLIKGNFHAVYRDDLQKLLETECPQYI

RKKGADVWFKELDINTDGAVNFQEFLILVIKMG 

VGSPQKKVASYF 


3007 


A 


1 


1253 


MYEGIRCLLKALLGFVSLAIGTLYCPRQYRPFPG 

SLGIEAINVPEPIPDSYYRDMATWPTHAPSVEEG 

GQGRFGNQADHFLGSLAFAKLLNRSLAVPSWIE 

YQHHKPPFTNLHVSYQKYFKLEPLQAYHRVISLE 

DFMEKLAPTHWPPEKRVAYCFEVAAQRSPDKKT 

CPMKEGNPFGPFWDQFHVSFNKSELFTGISFSAS 

YREQWSQRFSPKEHPVLALPGAPAQFPVLEEHRP 

LQKYMVWSDEMVKTGEAQIHAHLVRPYVGIHL 

RIGSDWKNACAMLKDGTAGSHFMASPQCVGYS 

RSTAAPLTMTMCLPDLKEIQRAVKLWVRSLDAQ 

SVYVATDSESYVPELQQLFKGKVKWSLKPEVA 

QVDLYILGQADHFIGNCVSSFTAFVKRERDLQGR 
PSSFFGMDRPPKLRDEF 


3008 


A 


3136 


1898 


TARGGGSEPGPTMAANYSSTSTRREHVKVKTSS 

QPGFLERLSETSGGMFVGLMAFLLSFYLIFTNEG 

RALKTATSLAEGLSLVVSPDSfflSVAPENEGRLV 

HnGALRTSKLLSDPNYGVHLPAVKLRRHVEMY 

QWVETEESREYTEDGQVKKETRYSYNTEWRSEH 

NSKNFDREIGHKNPRAMAGESFMATAPFVQIGRF 

FLSSGLIDKVDNFKSLSLSKLEDPHVDIIRRGDFF 

YHSENPKYPEVGDLRVSFSYAGLSGDDPDLGPA 

HWTVIARQRGDQLVPFSTKSGDTLLLLHHGDFS 

AEEVFHRELRSNSMKTWGLRAAGWMAMFMGL 

NLMTRILYTLVDWFPVFRDLVNIGLKAFAFCVAT 

SLTLLTVAAGWLFYRPLWALLIAGLALVPBLVAR 

TRVPAKKLE 


3009 


A 


93 


659 


DAAVAMTAQGGLVANRGRRFKWABELSGPGGG 

SRGRSDRGSGQGDSLYPVGYLDKQVPDTSVQET 

DRILVEKRCWDIALGPLKQIPMNLFIMYMAGNTI 

SIFPTMMVCMMAWRPIQALMAISATFKMLESSS 

QKFLQGLVYLIGNLMGLALAVYKCQSMGLLPTH 

ASDWLAFIEPPERMEFSGGGLLL 


3010 


A 


2 


1041 


LIDSAKARYWTQRGTWVYDNALLLLLKCLWSN 

VVPECTMASSNTVLMRLVASAYSIAQKAGMIVR 

RVIAEGDLGIVEKTCATDLQTKADRLAQMSICSS 

LARKFPKLTIIGEEDLPSEEVDQELIEDSQWEEILK 

QPCPSQYSAIKEEDLVVWVDPLDGTKEYTEGLL 

DNVTVLIGIAYEGKAIAGVINQPYYNYEAGPDAV 

LGRTIWGVLGLGAFGFQLKEVPAGKHniTTRSH 

SNKLVTDCVAAMNPDAVLRVGGAGNKnQLIEG 

KASAYVFASPGCKKWDTCAPEVBLHAVGGKLTD 

fflGNVLQYHKDVKHMNSAGVLATLRNYDYYAS 

RVPESKNALVP 


3011 


A 


291 


1452 


SPQKTMRSHTITMTTTSVSSWPYSSHRMKFITNH 

SDQPPQNFSATPNVTTCPMDEKLLSTVLTTSYSVI 

FIVGLVGNIIALYVFLGIHRKRNSIQIYLLNVAIAD 

LLLIFCLPFRIMYHINQNKWTLGVILCKWGTLFY 

MNMYISIILLGHSLDRYIKINRSIQQRKAITTKQSI 

YVCCIVWMLALGGFLTMIILTLKKGGHNSTMCF 

HYRDKHNAKGEAIFNFILVVMFWLIFLLIILSYIKI 

GKNLLRISKRRSKFPNSGKYATTARNSFIVLIIFTI 

CFVPYHAFRFIYISSQLNVSSCYWKEIVHKTNEIM 

LVLSSFNSCLDPVMYFLMSSNIRKIMCQLLFRRF 

QGEPSRSESTSEFKPGYSLHDTSVAVKIQSSSKST 


3012 


A 


246 


1346 


TEPVGYTKAEEPIAMRSLGALLLLLSACLAVSAG 

PVPTPPDNIQVQENFNISRIYGKWYNLAIGSTCPW 

LKKIMDRMTVSTLVLGEGATEAEISMTSTRWRX 

GVCEETSGAYEKTDTDGKFLYHKSKWNITMESY 

WHTNYDEYAIFLTKKFSRHHGPTITAKLYGRAP 

QLRETLLQDFRVVAQGVGIPEDSIFTMADRGECV 

PGEQEPEPILlPRVRRAVLPQEEEGSGGGQLVTEV 

TKKEDSCQLGYSAGPCMGMTSRYFYNGTSMAC 

ETFQYGGCMGNGMNFVTEKECLQTCRTVAACN 

LPIVRGPCRAFIQLWAFDAVKGKCVLFPYGGCQ 

GNGNKPYSEKECREYCGVPGDGDEELLRFSN 


3013 


A 


67 


379 


RQMALLKANKDLISAGLKEFSVLLNQQVFNDPL 

VSEEDMVTVVEDWMNFYINYYRQQVTGEPQER 

DKALQELRQELNTLANPFLAKYRDFLKSHELPSH 

PPPSS 


3014 


A 


1 


373 


GTSWSTLRAVMSASVVSVVSRVLEEYLSSTPQRL 
KLLDAYLLmLTGALQFGYCLFVLTFHFNSLLLF 
FFFCVGSFHSNVYFLLFTLSFLCFLFIAYFFLIRFFS 
LFIWFFHVFFIELSLFYF 


3015 


A 


2 


1321 


AAAEGTAPSPGRVSPPTPARGEPEVTVEIGETYLC 

RRPDSTWHSAEVIQSRVNDQEGREEFYVHYVGF 

NRRLDEWVDKNRLALTKTVKDAVQKNSEKYLS 

ELAEQPERKITRNQKRKHDEINHVQKTYAEMDP 

TTAALEKEHEAITKVKYVDKIHIGNYEIDAWYFS 

PFPEDYGKQPKLWLCEYCLKYMKYEKSYRFHLG 

QCQWRQPPGKEIYRKSNISVYEVDGKDHKIYCQ 

NLCLLAKLFLDHKTLYFDVEPFVFYILTEVDRQG 

AHIVGYFSKEKESPDGNNVACILTLPPYQRRGYG 

KFLIAFSYELSKLESTVGSPEKPLSDLGKLSYRSY 

WSWVLLEILRDFRGTLSIKDLSQMTSITQNDHST 

LQSLNMVKYWKGQHVICVTPKLVEEHLKSAQY 

KKPPITGGWGAAVCRGRWGSVSIWTGRSQGLLI 

AVT 


3016 


A 


2 


1321 


AAAEGTAPSPGRVSPPTPARGEPEVTVEIGETYLC 

RRPDSTWHSAEVIQSRVNDQEGREEFYVHYVGF 

NRRLDEWVDKNRLALTKTVKDAVQKNSEKYLS 

ELAEQPERKITRNQKRKHDEINHVQKTYAEMDP 

TTAALEKEHEAITKVKYVDKIHIGNYEIDAWYFS 

PFPEDYGKQPKLWLCEYCLKYMKYEKSYRFHLG 

QCQWRQPPGKEIYRKSNISVYEVDGKDHKIYCQ 

NLCLLAKLFLDHKTLYFDVEPFVFYILTEVDRQG 

AHIVGYFSKEKESPDGNNVACILTLPPYQRRGYG 

KFLIAFSYELSKLESTVGSPEKPLSDLGKLSYRSY 

WSWVLLEILRDFRGTLSIKDLSQMTSITQNDHST

LQSLNMVKYWKGQHVICVTPKLVEEHLKSAQY 

KKPPITGGWGAAVCRGRWGSVSIWTGRSQGLLI 

AVT 


3017 


A 


38 


704 


EAHPGGQLGSERNGVRMDEDVLTTLKILIIGESG 
VGKSSLLLRFTDDTFDPELAATIGVDFKVKTISVD 

GNKAKLAIWDTAGQERFRTLTPSYYRGAQGVIL 
VYDVTRRDTFVKLDNWLNELETYCTRNDIVNM 
LVGNKIDKENREVDRNEGLKFARKHSMLFIEAS 
AKTCDGVQ'CAFEELVEKIIQTPGLWESENQNKG 
VKLSHREEGQGGGACGGYCSVL 


3018 


A 


2640 


2861 


APVLILQMVKLSIVLTPQFLSHDQGQLTKELQQH 
VKSVTCPCEYLRKVSECRQMGPGALEQFPGLSC 
HTSHSG 


3019 


A 


1307 


711 


PGITMAASLVGKKIVFVTGNAKKLEEVVQILGDK 

FPCTLVAQKIDLPEYQGEPDEISIQKCQEAVRQV 

QGPVLVEDTCLCFNALGGLPGPYIKWFLEKLKPE 

GLHQLLAGFEDKSAYALCTFALSTGDPSQPVRLF 

RGRTSGRIVAPRGCQDFGWDPCFQPDGYEQTYA 

EMPKAEKNAVSHRFRALLELQEYFGSLAA 


3020 


A 


1202 


180 


VSCLPTSCKMITLNNQDQPVPFNSSHPDEYKIAA 
LVFYSCiniGLFVNITALWWSCTTKKRTTVTIYM 
MNVALVDLIFIMTLPFRMFYYAKDEWPFGEYFC 
QILGALTVFYPSIALWLLAFISADRYMAIVQPKY 

AKHLKN rCKA VLACVG V WlMTLn"!" IPLLLLYK 

DPDKDSTPATCLKISDUYLKAVNVLNLTRLTFFF 

LIPLFIMIGCYLVIIHNLLHGRTSKLKPKVKEKSIRI 

nTLLVQVLVCFMPFHICFAFLMLGTGENSYNPW 

GAFTTFLMNLSTCLDVILYYIVSKQFQARVISVM 

LYK^m.RSMRRKSFRSGSLRSLSNINSEML 


3021 


A 


27 


1897 


EEFCTWIAVRVGEMETAPKPGKDVPPKKDKLQT 
KRKKPRRYWEEETVPTTAGASPGPPRNKKNREL 
RPQRPKNAYILKKSRISKKPQVPKKPREWKNPES 
QRGLSGAQDPFPGPAPVPVEVVQKFCRIDKSRKL 
PHSKAKTRSRLEVAEAEEEETSIKAARSELLLAEE 
PGFLEGEDGEDTAKICQADIVEAVDIASAAKHFD 
LNLRQFGPYRLNYSRTGRHLAFGGRRGHVAALD 
WVTKKLMCEINVMEAVRDIRFLHSEALLAVAQN 
IRWLHIYDNQGIELHCIRRCDRVTRLEFLPFHFLLA
tASETGFLTYLDVSVGklVAALNARAGRLDVMS 
QNPYNAVIHLGHSNGTVSLWSPAMKEPLAKILC 
HRGGVRAVAVDSTGTYMATSGLDHQLKIFDLRG 
TYQPLSTRTLPHGAGHLAFSQRGLLVAGMGDVV 
NIWAGQGKASPPSLEQPYLTHRLSGPVHGLQFCP 
FEDVLGVGHTGGITSMLVPGAGEPNFDGLESNPY 
RSRKQRQEWEVKALLEKVPAELICLDPRALAEV 
DVISLEQGKXEQIERLGYDPQAKAPFQPKPKQKG 
RSSTASLVKRKRKVMDEEHRDKVRQSLQQQHH 
KEAKAKPTGARPSALDRFVR 


3022 


A 


1


2249 


MTAQDSNTSAHAQRDGPELPASSSWRSFWPLSC 

LSSPPVSAVEVATEGRDREVAKVGQRFCDTTSGE 

LRQARDRDCCVRMPAPVGRRSPPSPRSSMAAVA 

LRDSAQGMTFEDVAIYFSQEEWELLDESQRFLYC 

DVMLENFAHVTSLGYCHGMENEAIASEQSVSIQ 

VRTSKGNTPTQKTHLSEIKMCVPVLKDILPAAEH 

QTTSPVQKSYLGSTSMRGFCFSADLHQHQKHYN 

EEEPWKRKVDEATFVTGCRFHVLNYFTCGEAFP 

APTDLLQHEATPSGEEPHSSSSKHIQAFFNAKSYY 

KWGEYRKASSHKHTLVQHQSVCSEGGLYECSK 

CEKAFTCKNTLVQHQQIHTGQK^^FECSECEESFS 

KKCHLILHKIIHTGERPYECSDREKAFIHKSEFIHH 

QRRHTGGVRHECGECRKTFSYKSNLIEHQRVHT 

GERPYECGECGKSFRQSSSLFRHQRVHSGERPYQ 

CCECGKSFRQIFNLIRHRRVHTGEMPYQCSDCGK 

SFSCKSELIQHQRIHSGERPYECRECGKSFRQFSN 

LIRHRSIHTGDRPYECSECEKSFSRKFILIQHQRVH 

TGERPYECSECGKSFTRKSDLIQHRRIHTGTRPYE 

GSECGKSFRQRSGLIQHRRLHTGERPYECSECGK 

SFSQSASLIQHQRVHTGERPYQCCECGKSFRQIFN 

LIRHRRVHTGEMPYQCSDCGKSFSCKSELIQHRRI 

HSGERPYECSECGKSFSRKSNLIRHRRVHTEERP 


3023 


A 


3148 


634 


AAGALRCLAAFPRAEPASRGRQSSPARACAASR 

AERATAAAMAHRCLRLWGRGGCWPRGLQQLL 

VPGGVGPGEQPCLRTLYRFVTTQARASRNSLLTD 

IIAAYQRFCSRPPKGFGKYFPNGKNGKKASEPKE 

VMGEKKESKPAATTRSSGGGGGGGGKRGGKKD 

DSHWWSRFQKGDIPWDDKDFRMFFLWTALFWG 

GVMFYLLLKRSGREITWKDFVNNYLSKGVVDRL 

EVVNKRFVRVTFTPGKTPVDGQYVWFNIGSVDT 

FERNLETLQQELGIEGENRVPWYIAESDGSFLLS 

MLPTVLIIAFLLYTIRRGPAAIGRTGRGMGGLFSV 

GETTAKVLKDEIDVKFKDVAGCEEAKLEIMEFV 

NFLKNPKQYQDLGAKIPKGAILTGPPGTGKTLLA 

KATAGEANVPFITVSGSEFLEMFVGVGPARVRDL 

FALARKNAPCILFIDEIDAVGRKRGRGNFGGQSE 

QENTLNQLLVEMDGFNTTTNVVILAGTNRPDILD 

PALLRPGRFDRQIHGPPDIKGRASIFKVHLRPLKL 

DSTLEKDKLARKLASLTPGFSGADVANVCNEAA 

LIAARHLSDSINQKHFEQAIERVIGGLEKKTQVLQ 

PEEKKTVAYHEAGHAVAGWYLEHADPLLKVSII 

PRGKGLGYAQYLPKEQYLYTKEQLLDRMCMTL 

GGRVSEEIFFGRITTGAQDDLRKVTQSAYAQIVQ 

FGMNEKVGQISFDLPRQGDMVLEKPYSEATARLI 

DDEVRILIM^AYKRTVALLTEKKADVEKVALLL 

LEKEVLDKNDMVELLGPRPFAEKSTYEEFVEGT 

GSLDEDTSLPEGLKDWNKEREKEKEEPPGEKVA 

N 


3024 


A 


274 


1455 


LRACSLPSMSALEKSMHLGRLPSRPPLPGSGGSQ 

SGAKMRMGPGRKRDFSPVPWSQYFESMEDVEV 

ENETGKDTFRVYKSGSEGPVLLLLHGGGHSALS 

WAVFTAAHSRVQCRIVALDLRSHGETKVKNPED 

LSAETMAKDVGNVVEAMYGDLPPPIMLIGHSMG 

GAIAVHTASSNLVPSLLGLCMIDVVEGTAMDAL 

NSMQNFLRGRPKTFKSLENAIEWSVKSGQIRNLE 

SARVSMVGQVKQCEGITSPEGSKSIVEGIIEEEEE 

DEEGSESISKRKKEDDMETKKDHPYTWRIELAKT 

EKYWDGWFRGLSNLFLSCPIPICLLLLAGVDRLD 

KDLTIGQMQGKFQMQVLPQCGHAVHEDAPDKV 

AEAVATFLIRHRFAEPIGGFQCVFPGC 


3025 


A 


621 


306 


YHGGQRGRAGGSFRSVQGWGGQLR>JPFRTSKSL 
SWKGLSSLLFPLYNLQMGRPRDRKELGRGHSPP 
HLEGPHMLPSGAARWRWLEAPVLVLEPLVLRPA 
AAPTP 


3026 


A 


1533 


454 


AKVPQSTREEKRENGLEARSPAINLMGFNVEEM 

YEAHAWIQRILSLQNHHUENNHILYLGRKEHDIL 

SQLQKTSSVSITEIISPGRTELEIEGARADLIEVVM 

NIEDMLCKVQEEMARKKERGLWRSLGQWTIQQ 

QKTQDEMKENIIFLKCPVPPTQELLDQKKQFEKC 

GLQVLKVEKIDNEVLMAAFQRKXKMMEEKLHR 

QPVSHRLFQQVPYQFCNVVCRVGFQRMYSTPCD 

PKYGAGIYFTKNLKNLAEKAKKISAADKLIYVFE 

AEVLTGFFCQGHPLNIVPPPLSPGAIDGHDSVVD 

NVSSPETFVIFSGMQAIPQYLWTCTQEYVQSQDY 

SSGPMRPFAQHPWRGFASGSPVD 


3027 


A 


179 


703 


PFHLGASSNTFRLQVQTQESKAQKEVKMGFIFSK 
SMNESMKNQKEFMLMNARLQLERQLIMQSEMR 
ERQMAMQIAWSREFLKYFGTFFGLAAISLTAGAI 
KKKKPAFLVPIVPLSFILTYQYDLGYGTLLERMK 
GEAEDILETEKSKLQLPRGMITFESIEKARKEQSR 
FFIDK 


3028 


A 


876 


1226 


AVGKEPESSSTWVRDREGfflRSRRSMKMLWKLT 
DNIKYEDCEVSATPARSSVRSQAPSLTLPLLLLSL 
QPAAKRGWDKLSPAQRPSLGFARRTRGRSCRER 
TWMLPSLVSEFLHRD


3029 


A 


3 


1731 


FREGRFGSSCAVAAPLAGFQGLIECGYLAVDSPP 

SCWTPGGSNPAAPLPQALLPPRLPPTVLPFLGPGL 

SGELEMFTLPQKDFRAPTTCLGPTCMQDLGSSHG 

EDLEGECSRKLDQKLPELRGVGDPAMISSNTSYL 

SSRGRMIKWFWDSAEEGYRTYHMDEYDEDKNP 

SGIINLGTSENKLCFDLLSWRLSQRDMQRVEPSL 

LQYADWRGPILFLREEVAKFLSFYCKSPVPLRPE 

NVVVLNGGASLFSALATVLCEAGEAFLIPTPYYG 

AITQHVCLYGNIRLAYVYLDSEVTGLDTRPFQLT 

VEKLEMALREAHSEGVKVKGLILISPQNPLGDVY 

SPEELQEYLVFAKRHRLHVIVDEVYMLSVFEKSV 

GYRSVLSLERLPDPQRTHVMWATSKDFGMSGLR 

FGTLYTENQDVATAVASLCRYHGLSGLVQYQM 

AQLLRDRDWINQVYLPENHARLKAAHTYVSEEL 

RALGIPFLSRGAGFFIWVDLRKYLLKGTFEEEML 

LWRRFLDNKVLLSFGKAFECBCEPGWFRFVFSDQ 

VHRLCLGMQRVQQVLAGKSQVAEDPRPSQSQEP 

SDQRR 


3030 


A 


1 


584 


PWLPWSDGRAARSSRKCPRSRFPVQVGKMAVST 

VFSTSSLMLALSRHSLLSPLLSVTSFRRFYRGDSP 

TDSQKDMIEIPLPPWQERTDESIETKRARLLYESR 

KRGMLENCILLSLFAKEHLQHMTEKQLNLYDRLI 

NEPSNDWDIYYWATEAKPAPEIFENEVMALLRD 

FAKNKNKEQRLRAPDLEYLFEKPR 


3031 


A 


1177 


359 


SLWPWILMDDSLMQISLQLLCVYTANFPNGCSSL 

CWSSCGQHPVQATHRGAVSNSLMLCILKLASQM 

PLENTTVQQMVFMLLSNLALSHDCKGVIQKSNF 

LQNFLSLALPKGGNKHLSNLTILWLKLLLNISSGE 

DGQQMILRLDGCLDLLTEMSKYKHKSSPLLPLLI 

FH>fVCFSPANKPKILANEKVITVLAACLESENQN 

AQRIGAAALWALIYNYQKAKTALKSPSVKRRVD 

EAYSLAKKTFPNSEANPLNAYYLKCLENLVQLL 

NSS 


3032 


A 


2 


1242 


GISGRPPRPAKRRMGKNPVRPPRALPPVPSQDDIP 

LSRPKKKKPRTKNTPASASLEGLAQTAGRRPSEG 

NEPSTKELKEHPEAPVQRRQKKTRLPLELETSST 

QKKSSSSSLLRNENGIDAEPAEEAVIQKPRRKTK 

KTQPAELQYANELGVEDEDIITDEQTTVEQQSVF 

TAPTGISQPVGKWVEKSRRFQAADRSELIKTTEN 

IDVSMDVKPSWTTRDVALTVHRAFRMIGLFSHG 

FLAGCAVWNIVVIYVLAGDQLSNLSNLLQQYKT 

LAYPFQSLLYLLLALSTISAFDRIDFAKISVAIRNF 

LALDPTALASFLYFTALILSLSQQMTSDRIHLYTP 

SSVNGSLWEAGIEEQILQPWIVVNLVVALLVGLS 

WLFLSYRPGMDLSEELMFSSEVEEYPDKEKEIKA 

SS 


3033 


A 


3 


1436 


TATSGGIWLRRKWRCHWPRPLPQSCVGTEGGLQ 

VRDTSSRIAKGGVDHTKMSLHGASGGHERSRDR 

RRSSDRSRDSSHERTESQLTPCIR2WTSPTRQHHV 

EREKDHSSSRPSSPRPQKASPNGSISSAGNSSRNS 

SQSSSDGSCKTAGEMVFVYENAKEGARNIRTSER 

VTLIVDNTRFVVDPSIFTAQPNTMLGRMFGSGRE 

HNFTRPNEKGEYEVAEGIGSTVFRAILDYYKTGH 

RCPDGISIPELREACDYLCISFEYSTIKCRDLSALM 

HELSNDGARRQFEFYLEEMILPLMVASAQSGERE

CHIVVLTDDDWDWDEEYPPQMGEEYSQUYSTK 

LYRFFKYIENRDVAKSVLKERGLKKIRLGIEGYP 

TYKEKVKKRPGGRPEVIYNYVQRPFIRMSWEKE 

EGKSRHVDFQCVKSKSITNLAAAAADIPQDQLV 

VMHPTPQVDELDILPIHPPSGNSDLDPDAQNPML 


3034 


A 


3 


1972 


SSLAQHRSVAVLGWPAGWAAARARPAMQGGN 

SGVRKREEEGDGAGAVAAPPAIDFPAEGPDPEY 

DESDVPAEIQVLKEPLQQPTFPFAVANQLLLVSL 

LEHLSHVPIEPNPLRSRQVFKLLCQTFIKMGLLSSF 

TCSDEFSSLRLHHNRAITHLMRSAKERVRQDPCE 

DISRIQKIRSREVALEAQTSRYLNEFEELAELGKG 

GYGRVYKVRNKLDGQYYAIKKILIKGATKTVCM 

KVLREVKVLAGLQHPNIVGYHTAWIEHVHVIQP 

RADRAAIELPSLEVLSDQEEDREQCGVKNDESSS 

SSIIFAEPTPEKEKRFGESDTENQNNKSVKYTTNL 

VIRESGELESTLELQENGLAGLSASSIVEQQLPLR 

RNSHLEESFTSTEESSEENVNFLGQTEAQYHLML 

HIQMQLCELSLWDWIVERNKRGREYVDESACPY 

VMANVATKIFQELVEGVFYIHNMGIVHRDLKPR 

NIFLHGPDQQVKIGDFGLACTDILQKNTDWTNR 

NGKRTPTHTSRVGTCLYASPEQLEGSEYDAKSD 

MYSLGVVLLELFQPFGTEMERAEVLTGLRTGQL 

PESLRKRCPVQAKYIQHLTjRJRNSSQRPSAIQLLQS 

ELFQNSGNVNLTLQMKIIEQEKEIAELKKQLNLL 

SQDKGVRDDGKDGGVG 


3035 


A 


110 


1172 


klscpcshgtrvtavrgprlkagvqwhdlgslq 

pppsglkqsshlslssswdfrhapthpetytcpk 

miemeqaeaqlaeldllasmfpgenelivndql 

avaelki)ciekktmegrsskvyf™mnldvsd 

ekmamfslacilpfkypavlpeitvrsvllsrsqq

tqlntdltaflqkhchgdvcilnatewvrehas 

GYVSRDTSSSPTTGSTVQSVDLIFTRLWIYSHHIY 

nkckrknilewakelslsgfsmpgkpgvvcveg 
pqsaceefwarlirklnwkrilirhredipfdgtn 
deterqrkfsifeekvfsvngargnhmdfgqly 
qflntkgcgdvfqmflwv 


3036 


A 


1


2288 


frfaerraaaaesdvsakmagrsmqaarcptd 

elsltncavvnekdfqsgqhvivrtspnhrytft 

lkthpsvvpgsiafslpqrkwaglsigqeievsly 

tfdkakqcigtmtieidflqkksidsnpydtdkm 

aaefiqqfnnqafsvgqqlvfsfneklfgllvkd 

ieamdpsilngepatgkrqkievglwgnsqvaf 

ekaensslnligkaktkenrqsonpdwnfekmg 

iggldkefsdifrrafasrvfppeiveqmgckhvk 

gillygppgcgktllarqigbcmlnarepkvvng 

peilnkyvgeseanirklfadaeeeqrrlgansg 

lhiiifdeidaickqrgsmagstgvhdtvvnqlls 

kidgveqlnnilvigmtnrpdlideallrpgrlev 

kmeiglpdekgrlqilhihtarmrghqllsadv 

dikelavetknfsgaeleglvraaqstamnrhi 

kastkvevdmekaeslqvtrgdflaslendikp 

afgtnqedyasyimngiikwgdpvtrvlddgel 

lvqqtknsdrtplvsvllegpphsgktalaakia 

eesnfpfikicspdkmigfsetakcqamkkifdda 

yksqlscvvvddierlldyvpigprfsnlvlqal

LVLLKXAPPQGRKLLIIGTTSRIODVLQEMEMLNA 
FS1TIHVPNIATGEQLLEALELLGNFKI)KERTTIA 
QQVKGKKVWIGKKLLMLIEMSLQMDPEYRVRK 
FLALLREEGASPLDFD 


3037 


A 


1 


1347 


MLDTGSEHLNRILKALPALQSAGSEGQNGSAESL 

GEGGTRDSDRAKRKLRGGNKEIPTFYPCLVVRSP 

VTASDLRGTQDFAAYHGLSLILEPLGACNRLSVC 

VPVHSPPGMRVSPRSPSLRTLVIDPAEPAGAQRL 

RFSGKERSGEAGSAVEGLAVAVSMGDGGAERD 

RGPARRAESGGGGGRCGDRSGAGDLRADGGGH 

SPTEVAGTSASSPAGSRESGADSDGQPGPGEADH 

CRRILVRDAKGTIREIVLPKGLDLDRPKRTRTFFT 

AEQLYRLEMEFQRCQYVVGRERTELARQLNLSE 

TQVKVWFQNRRTKQKKDQSRDLEKRASSSASEA 

FATSNILRLLEQGRLLSVPRAPSLLALTPSLPGLP 

ASHRGTSLGDPRNSSPRLNPLSSASASPPLPPPLP 

AVCFSSAPLLDLPAGYELGSSAFEPYSWLERKVG 

SASSCKKANT 


3038 


A 


924 


501 


TELLPLCSRSGPKPQSGDPLLQLAQQARPRLSGE 

RLETAPSLLLSRMACVISGWALSRGARTWTWAT 

PTGPVHRAQPAIRSLSAEGALTRLKEEKWPGRYI 

LPNHLTPPFLYKHLGSVPPSHWRSPLISHSVNILA 

LNWR 


3039 


A 


1263 


111 


ACGIRHEGALPGLTATPEAMLRFLPDLAFSFLLIL 

ALGQAVQFQEYVFLQFLGLDKAPSPQKFQPVPYI 

LKKIFQDREAAATTGVSRDLCYVKELGVRGNVL 

RFLPDQGFFLYPKKISQASSCLQKLLYFNLSAIKE 

REQLTLAQLGLDLGPNSYYNLGPELELALFLVQE 

PHVWGQTTPKPGKMFVLRSVPWPQGAVHFNLL 

DVAKDWNDNPRKNFGLFLEILVKEDRDSGVNFQ 

PEDTCARLRCSLHASLLVVTLNPDQCHPSRKRRA 

AIPVPKLSCKNLCHRHQLFINFRDLGWHKWIIAP 

KGFMANYCHGECPFSLTISLNSSNYAFMQALMH 

AVDPEIPQAVCIPTKLSPISMLYQDNNDNVILRHY 

EDMVVDECGCG 


3040 


A 


15 


849 


ASRLPRGPGCGADMRPLLGLLLVFAGCTFALYL 

LSTRLPRGRRLGSTEEAGGRSLWFPSDLAELREL 

SEVLREYRKEHQAYVFLLFCGAYLYKQGFAIPGS 

SFLNVLAGALFGPWLGLLLCCVLTSVGATCCYL 

LSSIFGKQLVVSYFPDKVALLQRKVEENRNSLFF 

FLLFLRLFPMTPNWFLNLSAPILNIPIVQFFFSVLI 

GLIPYNFICVQTGSILSTLTSLDALFSWDTVFKLL 

AIAMVALIPGTLIKKFSQKHLQLNETSTANHfflSR 

KDT 


3041 


A 


1015 


175 


GLKRRJILCFAKVGDVLGCLSLPPSRSARVLEDISI 

LSCISVDSRIVRTKVPCSVTMSRPRKRLAGTSGSD 

KGLSGKRTKTENSGEALAKVEDSNPQKTSATKN 

CLKNLSSHWLMKSEPESRLEKGVDVKFSIEDLKA 

QPKQTTCWDGVRNYQARNFLRAMKLGEEAFFY 

HSNCKEPGIAGLMKIVKEAYPDHTQFEKNNPHY 

DPSSKEDNPKWSMVDVQFVRMMKRFIPLAELKS 

YHQAHKATGGPLKNMVLFTRQRLSIQPLTQEEF 

DFVLSLEEKEPS 


3042 


A 


1015 


175 


GLKRRRLCFAKVGDVLGCLSLPPSRSARVLEDISI 
LSCISVDSRIVRTKVPCSVTMSRPRKRLAGTSGSD

KGLSGKRTKTENSGEALAKVEDSNPQKTSATKN 

CLKNLSSHWLMKSEPESRLEKGVDVKFSIEDLKA 

QPKQTTCWDGVKNYQARNFLRAMKLGEEAFFY 

HSNCKEPGIAGLMKIVKEAYPDHTQFEKNNPHY 

DPSSKEDNPKWSMVDVQFVRMMKRFIPLAELKS 

YHQAHKATGGPLKNMVLFTRQRLSIQPLTQEEF 

DFVLSLEEKEPS 


3043 


A 


153 


1133 


VGTAPAPGGRDRAPAMGSFQLEDFAAGWIGGA 

ASVIVGHPLDTVKTRLQAGVGYGNTLSCIRVVY 

RRESMFGFFKGMSFPLASIAVYNSVVFGVFSNTQ 

RFLSQHRCGEPEASPPRTLSDLLLASMVAGWSV 

GLGGPVDLIKIRLQMQTQPFRDANLGLKSRAVAP 

AEQPAYQGPVHCITTIVRNEGLAGLYRGASAML 

LRDVPGYCLYFIPYVFLSEWITPEACTGPSPCAV 

WLAGGMAGAISWGTATPMDVVKSRLQADGVY 

LNKYKGVLDCISQSYQKEGLKVFFRGITVNAVR 

GFPMSAAMFLGYELSLQAIRGDHAVTSP 


3044 


A 


41 


1316 


PPLGAGAGIHARSPHPARRLRLTAAGVGGRASG 

LLPTPWRRHHGPSGAAPYPAARLWQGPWRCRR 

PQPMAQRYDELPHYPGIADGPAALAGFPEAVPA 

APGPYGPHRPPQPLPPGLDSDGLKRDKDEIYGHP 

LFPLLALGFEKCELATCSPRDGAGAGLGTPRGGD 

VCSSDSFNEDNTAFAKQVCSERPFSSNPELDNLM 

IQAIQVLRFHLLELEKGKMPIDLVIEDRDGGCRE 

DFEDYPAPCPSLPDQNNIWIRDHEDSGSVHLGTP 

GPSSGGLASQSGDNSSDQGVGLDTSVASPSSGGE 

dedldqeprr>ikkrgifpkvatnimrawlfqhl 
shpypseeqkkqlaqdtgltilqvnnwfinarrr 
ivqpmidqsnrtgqgaafspegqpiggyteteph 
vafra;pasvgmslnsegewhyl 


3045 


A 


3 


967 


VAHTQWHTCQRLSQLTHRSILKYLLIDTHACQV 

LILKHTHASLSLPSCQECFPSSIPSASHMVSHPHPP 

PSPRWGQTPEGLPAASPCGPGPRSCFSSILPTGDS 

WGMLACLCTVLWHLPAVPALNRTGDPGPGPSIQ 

KTYDLTRYLEHQLRSLAGTYLNYLGPPFNEPDFN 

PPRLGAETLPRATVDLEVWRSLNDKLRLTQNYE 

AYSHLLCYLRGLNRQAATAELRRSLAHFCTSLQ 

GLLGSIAGVMAALGYPLPQPLPGTEPTWTPGPAH 

SDFLQKMDDFWLLKELQTWLWRSAKDFNRLKK 

KMQPPAAAVTLHLGAHGF 


3046 


A 


1185 


1584 


MYAYMYICTHICICAYRGIHIDVYLYMCIYIHIWI 
HTYLCVHIYVYVYICTHICMCIHTYVYVYTYMY 
VYTYICLCVYICLCVHIYLCVYIHMYMCTHICMC 
IHTYVHMCICVYIHMYTCVYVYTYTCVYMY 


3047 


A 


811 


132 


SLDLLGPIGILQEGRDPGTQGPQEKEKQMPASPM 
NTDAHLDINFKEGLKKERSYTGQFEANVRDEER 
QCGCGVVPDSLLMKVLSQRLDQQDCIQKGWVL 
HGVPRDLDQAHLLNRLGYNPNREFFLNVPFDSI 

MERLTLRRIDPVTGERYHLMYKPPPTMEIQARLL 
QNPKDAEEQVKLKMDLFYRNSADLEQLYGSAIT 
LNGDQDPYTVFEYIESGIINPLPKKIP 


3048 


A 


2 


U66 


RPRRGQGLVQEVQTENVTVAEGGVAEITCRLHQ 
YDGSIWIQNPARQTLFFNGTRALKDERFQLEEFS 
PRRVRIRLSDARLEDEGGYFCQLYTEDTHHQIAT 
LTVLVAPENPVVEVREQAVEGGEVELSCLVPRSR

PAATLRWYRDRKELKGVSSSQENGKVWSVAST 

VRFRVDRKDDGGUICEAQNQALPSGHSKQTQYV 

LDVQYSPTARIHASQAWREGDTLVLTCAVTGN 

PRPNQIRWNRGNESLPERAEAVGETLTLPGLVSA 

DNGTYTCEASNKHGHARALYVLWYGESRLRPT 

EGGGGAPDPGAVVEAQTSVPYAIVGGILALLVFL 

IICVLVGMVWCSVRQKGSYLTHEASGLDEQGEA 

REAFLNGSDGHKRKEEFFI 


3049 


A 


3159 


882 


VGCTLRVGVMAAAGSRKRRLAELTVDEFLASGF 

DSESESESENSPQAETREAREAARSPDKPGGSPSA 

SRRJCGRASEHKDQLSRLKDRDPEFYKFLQENDQ 

SLLNFSDSDSSEEEEGPFHSLPDVLEEASEEEDGA 

EEGEDGDRVPRGLKGKKNSVPVTVAMVERWKQ 

AAKQRLTPKLFHEWQAFRAAVATTRGDQESAE

ANKFQVTDSAAFNALVTFCIRDLIGCLQKLLFGK 

VAKDSSRMLQPSSSPLWGKLRVDnCAYLGSAIQL 

VSCLSETTVLAAVLRHISVLVPCFLTFPKQCRML 

LKRMVVVWSTGEESLRVLAFLVLSRVCRHKKDT 

FLGPVLKQMYITYVRNCKFTSPGALPFISFMQWT 

LTELLALEPGVAYQHAFLYIRQLAIHLRNAMTTR 

KKETYQSVYNWQYVHCLFLWCRVLSTAGPSEA 

LQPLVYPLAQVIIGCIKLIPTARFYPLRMHCIRALT 

LLSGSSGAFIPVLPFILEMFQQVDFNRKPGRMSSK 

PIWSVILKLS>rVNLQEKAYRDGLVEQLYDLTLE 

YLHSQAHCIGFPELVLPVVLQLKSFLRECKVANY 

CRQVQQLLGKVQENSAYICSRRQRVSFGVSEQQ 

AVEAWEKLTREEGTPLTLYYSHWRKLRDREIQL 

EISGKERLEDLNFPEIKRRKMADRKDEDRKQFKD 

LFDLNSSEEDDTEGFSERGILRPLSTRHGVEDDEE 

DEEEGEEDSSNSEDGDPDAEAGLAPGELQQLAQ 

GPEDELEDLQLSEDD 


3050 


A 


870 


182 


HLDRYIKSPGSGSSTPAPPSHLLLYLLHPQSTRTM 

GCCGCSRGCGSGCGGCGSSCGGCGSGCGGCGSG 

RGGCGSGCGGCSSSCGGCGSRCYVPVCCCKPVC 

SWVPACSCTSCGSCGGSKGGCGSCGGSKGGCGS 

CGCSQSSCCKPCCCSSGCGSSCSQSSCCKPCCCSS 

GCGSSCCQSSCCKPYCCQSSCCKPCSCFSGCGSS 

CCQSSCYKPCCCQSSCCVPVCCQCKI 


3051 


A 


175 


4330 


NIPRWNFQGKSFGWLVHFSSEEVDMASDSPARS 

LDEHDLSALRDPAGIFELVELVGNGTYGQVYKGR 

HVKTGQLAAIKVMDVTGDEEEEIKQEINMLKKY 

SHHRNIATYYGAFIKKNPPGMDDQLWLVMEFCG 

AGSVTDLIKNTKGYTLKEEWIAYICREILRGLSHL 

HQHKVIHRDIKGQNVLLTENAEVKLVDFGVSAQ 

LDRTVGRRNTFIGTPYWMAPEYIACDENPDATY 

DFKSDLWSLGITAffiMAEGAPPLCDMHPMRALF 

LDPRNPAPRLKSKKWSKKFQSFIESCLVKNHSQRP 

ATEQLMKHPFIRDQPNERQVRIQLKDHIDRTKKK 

RGEKDETEYEYSGSEEEEEENDSGEPSSDLNLPGE 

STLRRDFLRLQLANKERSEALRRQQLEQQQREN 

EEHKRQLLAERQKRIEEQKEQRRRLEEQQRREKE 

LRKQQEREQRRHYEEQMRREEERRRAEHEQEYI 

RRQLEEEQRQLEILQQQLLHEQALLLEYKRKQLE 

EQRQAERLQRQLKQERDYLVSLQHQRQEQRPVE 

KKPLYHYKEGMSPSEKPAWAKEVEERSRLNRQS

SPAMPHKVANRISDPNLPPRSESFSISGVQPARTP 

PMLRPVDPQIPHLVAVKSQGPALTASQSVHEQPT 

KGLSGFQEALNVTSHRVEMPRQNSDPTSENPPLP 

TRIEKFDRSSWLRQEEDIPPKVPQRTTSISPALAR 

KNSPGNGSALGPRLGSQPIRASNPDLRRTEPILES 

PLQRTSSGSSSSSSTPSSQPSSQGGSQPGSQAGSSE 

RTRVRANSKSEGSPVLPHEPAKVKPEESRDITRPS 

RPASYKKAIDEDLTALAKELRELRIEETNRPNIKK 

VTDYSSSSEESESSEEEEEDGESETHDGTVAVSDI 

PRLIPTGAPGSNEQYNVGMVGTHGLETSHADSFS 

GSISREGTLMIRETSGEKKRSGHSDSNGFAGHINL 

PDLVQQSHSPAGTPTEGLGRVSTHSQEMDSGTE 

YGMGSSTKASFTPFVDPRVYQTSPTDEDEEDEES 

SAAALFTSELLRQEQAKLNEARKISVVNVNPTNI 

RPHSDTPEIRKYKKRFNSEILCAALWGVNLLVGT 

ENGLMLLDRSGQGKVYNLINRRRFQQMDVLEG 

LNVLVTISGKKNKLRVYYLSWLRNRILHNDPEV 

EKKQGWITVGDLEGCIHYKVVKYERIKFLVIALK 

NAVEIYAWAPKPYHKFMAFKSFADLQHKPLLVD 

LTVEEGQRLKVIFGSHTGFHVIDVDSGNSYDIYIP 

SHIQGNITPHAIVILPKTDGMEMLVCYEDEGVYV 

NTYGRITKDVVLQWGEMPTSVAYIHSNQIMGW 

GEKAIEIRSVETGHLDGVFMHKRAQRLKFLCEKN 

DKVFFASVRSGGSSQVFFMTLNKNSMMNW 


3052 


A 


1 


615 


MGQVECGGQKLGNQLEDDSEPAEGKVYSSDEE 

KLEASAGDPAGSEQEEEGSGGDSEDDGFLDSSA 

GGPGALLGPKPKLKGSLGTGAEEGAPVTAGVTA 

PGGKSRRRRTAFTSEQLLELEKEFHCKKYLSLTE 

RSQIAHALKLSEVQVKIWFQNRRAKWKRIKAGN 

VSSRSGEPVRNPKIWPIPVHVNRFAVRSQHQQM 

EQGARP 


3053 


A 


203 


2167 


FGVRVPSNTQCLVPSFHCMQTSEWDSECLTSLQP 

LPLPTPPAANEAHLQTAAISLWTVVAAVQAffiRK 

VEmSRRLLHLEGRTGTAEKKLASCEKTVTELGN 

QLEGKGAVLGTLLQEYGLLQRRLENLENLLRNR 

NFWILRLPPGIKGDIPKVPVAFDDVSIYFSTPEWE 

KLEEWQKELYKNIMKGNYESLISMDYAINQPDV 

LSQIQPEGEHNTEDQAGPEESEffTDPSEEPGISTS 

DILSWIKQEEEPQVGAPPESKESDVYKSTYADEE 

LVIKAEGLARSSLCPEVPVPFSSPPAAAKDAFSDV 

AFKSQQSTSMTPFGRPATDLPEASEGQVTFTQLG 

SYPLPPPVGEQVFSCHHCGKNLSQDMLLTHQCS 

HATEHPLPCAQCPKHFTPQADLSSTSQDHASETP 

PTCPHCARTFTHPSRLTYHLRVHNSTERPFPCPDC 

PKRFADQARLTSHRRAHASERPFRCAQCGRSFSL 

KISLLLHQRGHAQERPFSCPQCGIDFNGHSALIRH 

QMIHTGERPYPCTDCSKSFMRKEHLLNHRRLHT 

GERPFSCPHCGKSFIRKHHLMKHQRIHTGERPYP 

CSYCGRSFRYKQTLKDHLRSGHNGGCGGDSDPS 

GQPPNPPGPLITGLETSGLGVNTEGLETNQWYGE 

GSGGGVL 


3054 


A 


3 


2212 


SCGHKSAYGSYTGLQLFWEDGQELLQHQQLQD 
LRLCVHLRPQSEKVELSLWTLFWGKGEPSAVR 
EKLGKAGFAAASGPGGRPGAERASTVLNILHLT 
AESRWEPNACNRVSSSPAGVGPLDLPVGPLLYFF

APWARASFLCHAFQRPLTGIGLNTVRFTSEFPLH 

SKDPTAHKLLFTGNYLCKLHPRPRHAPQGSLSDF 

CHGTEGKDLPSEHNVSVEGVAQDRSPEATLCPQ 

KTCPCDICGLRLKDILHLAEHQTTHPRQKPFVCE 

AYVKGSEFSANLPRKQVQQNVHNPIRTEEGQAS 

PVKTCRDHTSDQLSTCREGGKDFVATAGFLQCE 

VTPSDGEPHEATEGVVDFHIALRHNKCCESGDAF 

NNKSTLVQHQRIHSRERPYECSKCGIFFTYAADL 

TQHQKVHNRGKPYECCECGKFFSQHSSLVKHRR 

VHTGESPHVCGDCGKFFSRSSNLIQHKRVHTGEK 

PYECSDCGKFFSQRSNLIHHKRVIjrrGRSAHECSE 

CGKSFNCNSSLIKHWRVHTGERPYKCNECGKFFS 

HIASLIQHQIVHTGERPHGCGECGKAFIRSSDLMK 

HQRVHTGERPYECNECGKLFSQSSSLNSHRRLHT 

GERPYQCSECGKFFNQSSSLNNHRRLHTGERPYE 

CSECGKTFRQRSNLRQHLKVHKPDRPYECSECG 

KAFNQRPTLIRHQKIHIRERSMENVLLPCSQHTPE 

ISSENRPYQGAVNYKLKLVPIPSTHPGEVP 


3055 


A 


268 


2954 


ARRSSSSQGSAAPTPCQVVEASRDQLVAGPSGK 

MGNREMEELIPLVNRLQDAFSALGQSCLLELPQI 

AVVGGQSAGKSSVLENFVGRDFLPRGSGIVTRRP 

LVLQLVTSKAEYAEFLHCKGKKFTDFDEVRLEIE 

AETDRVTGMNKGISSIPINLRVYSPHVLNLTLIDL 

PGITKVPVGDQPPDIEYQIRMIMQFITRENCLILA 

VTPANTDLANSDALKLAKEVDPQGLRTIGVITKL 

DLMDEGTDARDVLENKLLPLRRGYVGVVNRSQ 

KDIDGKKDIKAAMLAERKFFLSHPAYRHIADRM 

GTPHLQKVLNQQLTNHIRDTLPNFRNKLQGQLLS 

EEHEVEAYKNFKPEDPTRKTKALLQMVQQFAVD 

FEKRIEGSGDQVDTLELSGGAKINRIFHERFPFEIV 

KMEFNEKELRREISYAIKNIHGIRTGLFTPDMAFE 

AIVKKQIVKLKGPSLKSVDLVIQELINTVKKCTK 

KLANFPRLCEETERIVANHIREREGKTKDQVLLLI 

DIQVSYINTNHEDFIGFANAQQRSSQVHKKTTVG 

NQVIRKGWLTISNIGIMKGGSKGYWFVLTAESLS 

WYKDDEEKEKKYMLPLDNLKVRDVEKSFMSSK 

HIFALFNTEQRNVYKDYRFLELACDSQEDVDSW 

KASLLRAGVYPDKSVGNNKAENDENGQAENFS 

MDPQLERQVETIRNLVDSYMSIINKCIRDLIPKTI 

MHLMINNVKDFINSELLAQLYSSEDQNTLMEES 

AEQAQRRDEMLRMYQALKEALGIIGDIGTATVS 

TPAPPPVDDSWIQHSRRSPPPSPtTQRRPTLSAPL 

ARPTSGRGPAPAIPSPGPHSGAPPVPFRPGPLPPFP 

SSSDSFGAPPQVPSRPTRAPPSVPSRRPPPSPTRPTI 

IRPLESSLLD 


3056 


A 


1674 


1839 


WRVTCCPPARSTTERTNAYDEEDCVEMVASGG 
WNDVACHTTMYFMCEFDKKNM 


3057 


A 


1674 


1839 


VVRVTCCPPARSTTERTNAYDEEDCVEMVASGG 
WNDVACHTTlvrYFMCEFDKKNM 


3058 


A 


3363 


2525 


FLVKLILIILCRCLHSLSRSVQQLRTSFQDHAVWK 

PLMKVLQNAPDEILWASSMLCNLLLEFSPSKEPI 

LESGAVELLCGLTQSENPALRVNGIWALMNMAF 

QAEQKIKADILRSLSTEQLFRLLSDSDLNVLMKT 

LGLLRNLLSTRPHIDKIMSTHGKQIMQAVTLILEG 

EHNIEVKEQTLCILANIADGTTAKDLIMTNDDILQ

KIKYYMGHSHVKLQLAAMFCISNLIWNEEEGSQ 
ERQDKLRDMGIVDILHKLSQSPDSNLCDKAKMA 
LQQYLA 


3059 


A 


679 


167 


SSWPSLSSQMHFPSFHLHVAAHYGRDSFVRLLLE 

FKAEVDPLSDKGnTLQLAHRERSSCVKILLDHN 

ANroiQNGFLLRYAVIKSlffl[SYCRI^LQRGA^ 

LGRLEDGQTPLHLSALRDDVLCARMLYNYGAD 

TNTRNYEGQTPLAVSISISGSSRPCLDFLQEVTSM 


3060 


A 


30 


234 


PPLQLDMDPNCYCADGDSCTCAGSCKCKECKCT 

SCKKSCCSCCPAGCAKCAQGCICKGATDKCSCC 

A 


3061 


A 


428 


720 


VRRDVRQQATWAMASDLDFSPPEVPEPTFLENL 

LRYGLFLGAIFQLICVLAIIVPIPKSHEAEAEPSEPR 

SAEVTRKPKAAWSWfKRPKXETKK^ 


3062 


A 


1589 


276 


WKQKYEPLGLDAAGIEEAITAVGSFILKANELLQ 

VIDSSMKNFKAFFRWLYVAMLRMTEDHVLPELN 

KMTQKDITFVAEFLTEHFNEAPDLYNRKGKYFN 

VERVGQYLKDEDDDLVSPPNTEGNQWYDFLQN 

SSHLKESPLLFPYYPRKSLHFVKRRMENIIDQCLQ 

KPADVIGKSMNQAICIPLYRDTRSEDSTRRLFKFP 

FLWNNKTSNLHYLLFTILEDSLYKMCILRRHTDIS 

QSVSNGLIAIKFGSFTYATTEKVRRSIYSCLDAQF 

YDDETVTWLKDTVGREGRDRLLVQLPLSLVYN 

SEDSAEYQFTGTYSTRLDEQCSAIPTRTMHFEKH 

WRLLESMKAQYVAGNGFRKVSCVLSSNLRHVR 

VFEMDIDDEWELDESSDEEEEASNKPVKIKEEVL 

SESEAENQQAGAAALAPEIVIKVEKLDPELDS 


3063 


A 


50 


849 


DKMPSJFAYQSSEVDWCESNFQYSELVAEFYNTF 

SNIPFFIFGPLMMLLMHPYAQKRSRYIYVVWVLF 

MnGLFSMYFHMTLSFLGQLLDEIAILWLLGSGYS 

IWMPRCYFPSFLGGNRSQFIRLVFITTVVSTLLSFL 

RPTVNAYALNSIALHILYIVCQEYRKTSNKELRH 

LIEVSVVLWAVALTSWISDRLLCSFWQRIHFFYL 

HS1WHVLISITFPYGMVTMALVDA>IYEMPGETL 

KVRYWPRDSWPVGLPYVEIRGDDKDC 


3064 


A 


1523 


925 


AATMADGQMPFSCHYPSRLRRDPFRDSPLSSRLL 

DDGFGMDPFPDDLTASWPDWALPRLSSAWPGTL 

RSGMVPRGPTATARFGVPAEGRTPPPFPGEPWK 

VCVNTVHSFKPEELMVKTKDGYVEVSGKHEEKQ 

QEGGIVSKNFTKKIQLPAEVDPVTVFASLSPEGLL 

IIEAPQVPPYSTFGESSFNNELPQDSQEVTCT 


3065 


A 


230 


2929 


LSTSLTGSHLFSLGNHSTRENLNAGNFNFPSEGH 

LVRSTGPGGSFAKHMVAQCVSPKGPLACSRTYF 

FGATHVPYLGGDSKLPKKTEQIRLLSQIYAAVIE 

AVLAGIACYAKTSSLTKAKEVAEQTLGSGLDSFE 

LIPFKAALRSKMTFHIHAVNNQGRIVPLDSEDSLS 

FVKTACMAVYDIPDLLGGNGCLGSVVFSESFLTS 

QILVKEKDGTVTTETSSVVLTAAVPRFCSWLVED 

NEVKLSEKTHQAVRGDESFLGTYLTGGEGAYLY 

SSNLQSWPEEGNVHFFSSGLLFSHCRHGSniSKD 

HMNSISFYDGDSTSTVAALLIDFKSSLLPHLPVHF 

HGSSNFLMIALFPKSKIYQAFYSEVFSLWKQQDN 

SGISLKVIQEDGLSVEQKRLHSSAQKLFSALSQPA 

GEKRSSLKLLSAKLPELDWFLQHFAISSISQEPVM 

RTHLPVLLQQAEmTTHRIESDKVIISIVTGLPGCH

ASELCAFLVTLHKECGRWMVYRQIMDSSECFHA 

AHFQRYLSSALEAQQNRSARQSAYIRKKTRLLV 

VLQGYTDVIDVVQALQTHPDSNVKASFnGATTA 

CVEPMSCYMEHRFLFPKCLDQCSQGLVSNVVFT 

SHTTEQRHPLLVQLQSLIRAANPAAAFILAENGIV 

TRNEDIELILSENSFSSPEMLRSRYLMYPGWYEG 

KLNAGSVYPLMVQICVWFGRPLEKTRFVAKCKA 

IQSSIKPSPFSGNIYHILGKVKFSDSERTMEVCYNT 

LANSLSIMPVLEGPTPPPDSKSVSQDSSGQQECYL 

VFIGCSLKEDSIKDWLRQSAKQKPQRKALKTRG 

MLTQQEIRSIHVKRHLEPLPAGYFYNGTQFVNFF 

GDKTDFHPLMDQFMNDYVEEANREIEKYNQELE 

QQEYHDLFELKP 


3066 


A 


130 


588 


LAPLRCQPGTRTQPRSHPAANDPSAAMSAAGAR
GLRATYHRLLDKVELMLPEKLRPLYNHPAGPRT 
VFFWAPIMKWGLVCAGLADMARPAEKLSTAQS 
AVLMATGFIWSRYSLVIIPKNWSLFAVNFFVGAA 
GASQLFRIWRYNQELKAKAHK 


3067 


A 


2 


1016 


EFARRRVFIAAREMSLLRSLRVFLVARTGSYPAG 

SLLRQSPQPRHTFYAGPRLSASASSKELLMKLRR 

KTGYSFVNCKKALETCGGDLKQAEIWLHKEAQ 

KEGWSKAAKLQGRKTKEGLIGLLQEGNTTVLVE 

VNCETDFVSKNLKFQLLVQQVALGTMMHCQTL 

KDQPSAYSKGFLNSSELSGLPAGPDREGSLKDQL 

ALAIGKLGENMILKRAAWVKVPSGFYVGSYVHG 

AMQSPSLHKLVLGKYGALVICETSEQKTNLEDV 

GRRLGQHWGMAPLSVGSLDDEPGGEAETKML 

SQPYLLDPSITLGQYVQPQGVSWDFVRFECGEG 

EEAAETE 


3068 


A 


3 


1679 


NSRVWGPWTEPSAGSLRPMARKQNKNSKELGL 

VPLTDDTSHAGPPGPGRALLECDHLRSGVPGGR 

RRKDWSCSLLVASLAGAFGSSFLYGYNLSWNA 

PTPYIKAFYNESWERRHGRPIDPDTLTLLWSVTV 

SIFAIGGLVGTLIVKMIGKVLGRXHTLLANNGFAI 

SAALLMACSLQAGAFEMLIVGRFIMGIDGGVALS 

VLPMYLSEISPKEIRGSLGQVTAIFICIGVFTGQLL 

GLPELLGKESTWPYLFGVIWPAWQLLSLPFLP 

DSPRYLLLEKHNEARAVKAFQTFLGKADVSQEV 

EEVLAESRVQRSIRLVSVLELLRAPYVRWQVVT 

VIVTMACYQLCGLNAIWFYTNSIFGKAGIPPAKIP 

YVTLSTGGIETLAAVFSGLVIEHLGRRPLLIGGFG 

LMGLFFGTLTITLTLQDHAPWVPYLSIVGILAIIAS 

FCSGPGGIPFILTGEFFQQSQRPAAFIIAGTVNWLS 

NFAVGLLFPFIQKSLDTYCFLVFAHCITGAIYLYF 

VLPETKNRTYAEISQAFSKRNKAYPPEEKIDSAV 

TDGKINGRP 


3069 


A 


86J 


300 


AAGAVVSAMPKAKGKTRRQKFGYSVNRKRLNR 

NARRKAAPRIECSHIRHAWDHAKSVRQNLAEMG 

LAVDPNRAVPLRKRKVTCAMEVDIEERPKELVRK 

PYVLNDLEAEASLPEKKGNTLSRDLIDYVRYMV 

ENHGEDYKAMARDEKNYYQDTPKQIRSKINVY 

KRFYPAEWQDFLDSLQKRKMEVE 


3070 


A 


325 


2019 


LAEPEVATDSGQQADLPAEGGDPRAEASCSVLH 
SKPHAMADSRDPASDQMQHWKEQRAAQKADV 
LTTGAGNPVGDKLNVITVGPRGPLLVQDWFTD

EMAHFDRERIPERWHAKGAGAFGYFEVTHDIT 

KYSKAKVFEfflGKKTPIAVRFSTVAGESGSADTV 

RDPRGFAVKFYTEDGNWDLVGNNTPIFFIRDPILF 

PSFfflSQKRNPQTHLKDPDMVWDFWSLRPESLH 

QVSFLFSDRGEPDGHRHMNGYGSHTFKLVNANG 

BAVYCKFHYKTDQGIKNLSVEDAARLSQEDPDY 

GIRDLFNAIATGKYPSWTFYIQVMTFNQAETFPF 

NPFDLTKVWPHKDYPLIPVGKLVLNRNPVNYFA 

EVEQIAFDPSNMPPGIEASPDKMLQGRLFAYPDT 

HRHRLGPNYLHIPVNCPYRARVANYQRDGPMC 

MODNOGGAPNYYPNSFGAPEOOPSALEHSIOYS 

GEVRRFNTANDDNVTQVRAFYVNVLNEEQRKR 

LCENIAGHLKDAQIFIQKKAVKNFTEVHPDYGSH 

IQALLDKYNAEKPKNAIHTFVQSGSHLAAREKA 

NL 


3071 


A 


1 


1187 


SLGWLERPPALSRAAGDGARRLSGSRRGDVWLT 

SSAAGLLRSVAGGSWCGGQLRARGGSGRCVAR 

AMTGNAGEWCLMESDPGVFTELIKGFGCRGAQ 

VEEIWSLEPENFEKLKPVHGLIFLFKWQPGEEPA 

GSVVQDSRLDTIFFAKQVINNACATQAIVSVLLN 

CTHQDVHLGETLSEFKEFSQSFDAAMKGLALSN 

SDVIRQVHNSFARQQMFEFDTKTSAKEEDAFHF 

VSYVPVNGRLYELDGLREGPIDLGACNODDWIS 

AVRPVIEKRIQKYSEGEIRFNLMAIVSDRKMIYEQ 

KIAELQRQLAEEEPMDTDQGNSMLSAIQSEVAK 

NQMLIEEEVQKLKRYKIENIRRKHNYLPFIMELL 

KTLAEHQQLIPLVEKAKEKQNAKKAQETK 


3072 


A 


103 


2775 


RLRTLAPPGLLLGPPLVPDSRRRHQASLTPLfflSG 

SPQLVGRGDRKLRTEVLVPPAALPAETRQRRSER 

LPRRTCPRGGAPGPGRSRLPRSLPPPSAPGLRSPV 

WAAGLGGGGRREPSRGKGGAALRARHRSTMAE 

LGAGGDGHRGGDGAVRSETAPDSYKVQDKKNA 

SSRPASAISGQNNNHSGNKPDPPPVLRVDDRQRL 

ARERREEREKQLAAREIVWLEREERARQHYEKH 

LEERKKRLEEQRQKEERRRAAVEEKRRQRLEED 

KERHEAWRRTMERSQKPKQKHNRWSWGGSLH 

GSPSfflSADPDRRSVSTMNLSKYVDPVISKRLSSS 

SATLLNSPDRARRLQLSPWESSVVNRLLTPTHSF 

LARSKSTAALSGEAVIPICPRSASCSPUMPYKAAH 

SRNSMDRPKLFVTPPEGSSRRRnHGTASYKKERE 

RENVLFLTSGTRRAVSPSNPKARQPARSRLWLPS 

KSLPHLPGTPRPTSSLPPGSVKAAPAQVRPPSPGN 

IRPVKREVKVEPEKKDPEKEPQKVANEPSLKGRA 

PLVKVEEATVEERTPAEPEVGPAAPAMAPAPAS 

APAPASAPAPAPVPTPAMVSAPSSTVNASASVKT 

SAGTTDPEEATRLLAEKRRLAREQREKEERERRE 

QEELERQKREELAQRVAEERTTRREEESRRLEAE 

QAREKEEQLQRQAEERALREWEEAERAQRQKEE 

EARVREEAERVRQEREKHFQREEQERLERKKRL 

EEIMKRTRRTEATDKKTSDORNGDIAKGALTGG 

TEVSALPCTTNAPGNGKPVGSPHVVTSHQSKVT 

VESTPDLEKQPNENGVSVQNENFEEIINLPIGSKP 

SRLDVTNSESPEIPLNPILAFDDEGTLGPLPQVDG 

VQTQQTAEVI 


3073 


A 


67 


2415 


PPRVCRDHVCLICWDPIAGTGGSRSTMPALPLDQ

LQITHKDPKTGKLRTSPALHPEQKADRYFVLYKP 

PPKDNIPALVEEYLERATFVANDLDWLLALPHD 

KFWCQVBFDETLQKCLDSYLRYVPRKFDEGVAS 

APEWDMQKRLHRSVFLTFLRMSTHKESKDHFIS 

PSAFGEILYNNFLFDIPKILDLCVLFGKGNSPLLQ 

KMIGNBFTQQPSYYSDLDETLPTILQVFSNILQHC 

GLQGDGANTTPQKLEERGRLTPSDMPLLELKDIV 

LYLCDTCTTLWAFLDIFPLACQTFQKHDFCYRLA 

SFYEAAIPEMESAKKRRLEDSKLLGDLWQRLSH 

SRKKLMEIFHIILNQICLLPILESSCDNIQGFIEEFL 

QIFSSLLQEKRFLRDYDALFPVAEDISLLQQASSV 

LDETRTAYILQAVESAWEGVDRJRKATDAKDPSV 

lEEPNGEPNGVTVTAEAVSQASSHPENSEEEECM 

GAAAAVGPAMCGVELDSLISQVKDLLPDLGEGFI 

LACLEYYHYDPEQVINNILEERLAPTLSQLDRNL 

DREMKPDPTPLLTSRHNVFQNDEFDVFSRDSVDL 

SRVHKGKSTRKEENTRSLLNDKRAVAAQRQRYE

QYSWVEEVPLQPGESLPYHSVYYEDEYDDTYD 

GNQVGANDADSDDELISRRPFTIPQVLRTKVPRE 

GQEEDDDDEEDDADEEAPKPDHFVQDPAVLREK 

AEARRMAFLAKKGYRHDSSTAVAGSPRGHGQS 

RETTQERRKKEANKATRANHNRRTMADRKRSK 

GMPS 


3074 


A 


3 


251 


GBARSPPPAAALLDMDPETCPCPSGGSCTCADSC 
KCEGCKCTSCKKSCCSCCPAECEKCAKDCVCKG 
GEAAEAEAEKCSCCQ 


3075 


A 


255 


982 


SQFSLSQVLVDSAEEGSLAAAAELAAQKREQRL 

RKFRELHLMRNEARKLNHQEVVEEDKRLKLPAN 

WEAKKARLEWELKEEEKKKECAAROEDYEKVK 

LLEISAEDAERWERKKKRKNPDLGFSDYAAAQL 

RQYHRLTKQIKPDMETYERLREKHGEEFFPTSNS 

LLHGTHVPSTEEIDRMVIDLEKQIEKRDKYSRRR 

PYNDDADEDYINERNAKFNKKAERFYGKYTAEI 

KQNLERGTAV 


3076 


A 


255 


982 


SQFSLSQVLVDSAEEGSLAAAAELAAQKREQRL 

RKFRELHLMRNEARKLNHQEVVEEDKRLKLPAN 

WEAKKARLEWELKEEEKKKECAARGEDYEKVK 

LLEISAEDAERWERKKKRKNPDLGFSDYAAAQL 

RQYHRLTKQIKPDMETYERLREKHGEEFFPTSNS 

LLHGTHVPSTEEIDRMVIDLEKQIEKRDKYSRRR 

PYNDDADDDYINERNAKFNKKAERFYGKYTAEl 

KQNLERGTAV 


3077 


A 


1 


968 


FRLRPRRACAQLLWHPAAGMASWAKGRSYLAP 

GLLQGQVAIYTGGATGIGKAIVKELLELGSNWI 

ASRKLERLKSAADELQANLPPTKQARVIPIQCNIR 

NEEEVNNLVKSTLDTFGKINFLVNNGGGQFLSPA 

EfflSSKGWHAVLETNLTGTFYMCKAVYSSWMK 

KHGGSrVNirVPTKAGFPLAVHSGAARAGVYNLT 

KSLAFEWACSGIRINCVAPGVIYSQTAVENYGSW 

GQSFFEGSFQKIPAKRIGVPEEVSSWCFLLSPAA 

SFITGQSVDVDGGRSLYTHSYEVPDHDNWPKGA 

GDLSWKKMKETFKEKAKL 


3078 


A 


2 


3508 


FVRESGKAPVTFDDITVYLLQEEWVLLSQQQKEL 

CGSNKLVAPLGPTVANPELFRKFGRGPEPWLGS 

VQGQRSLLEHHPGKKQMGYMGEMEVQGPTRES 






WO 01/57190 PCT/US01/04098













GQSLPPQKKAYLSHLSTGSGHIEGDWAGRNRKL 

LKPRSIQKSWFVQFPWLIMNEEQTALFCSACREY 

PSIRDKRSRLIEGYTGPFKVETLKYHAKSKAHMF 

CVNALAARDPIWAARFRSIRDPPGDVLASPEPLF 

TADCPIFYPPGPLGGFDSMAELLPSSRAELEDPGG 

DGAIPAMYLDCISDLRQKEITDGIHSSSDINILYN 

DAVESCIQDPSAEGLSEEVPVVFEELPWFEDVA 

VYFTREEWGMLDKRQKELYRDVMRMNYELLAS 

LGPAAAKPDLISKLERRAAPWIKDPNGPKWGKG 

RPPGNKKMVAVREADTQASAADSALLPGSPVEA 

RASCCSSSICEEGDGPRRIKRTYKPRSIQRSWFGQ 

FPWLVIDPKETKLFCSACIERPNLHDKSSRLVRG 

YTGPFKVETLKYHEVSKAHRLCVNTVEIKEDTPH 

TALVPEISSDLMANMEHFFNAAYSIAYHSRPLND 

FEKILQLLQSTGTVILGKYRNRTACTQFIKYISETL 

KREILEDVRNSPCVSVLLDSSTDASEQACVGIYIR 

YFKQMEVKESYITLAPLYSETADGYFETIVSALD 

ELDIPFRKPGWVVGLGTDGSAMLSCRGGLVEKF 

QEVIPQLLPVHCVAHRLHLAWDACGSIDLVKK 

CDRHIRTVFKFYQSSNKRLNELQEGAAPLEQEIIR 

LKDLNAVRWVASRRRTLHALLVSWPALARHLQ 

RVAEAGGQIGHRAKGMLKLMRGFHFVKFCHFL 

LDFLSIYRPLSEVCQKEIVLITEVNATLGRAYVAL 

ESLRHQAGPKEEEFNASFKDGRLHGICLDKLEVA 

EQRFQADRERTVLTGffiYLQQRFDADRPPQLKN 

MEVFDTMAWPSGIELASFGNDDILNLARYFECSL 

PTGYSEEALLEEWLGLKTIAQHLPFSMLCKNALA 

QHCRFPLLSKLMAWVCVPISTSCCERGFKAMN 

RIRTDERTKLSNEVLNMLMMTAVNGVAVTEYD 

PQPAIQHWYLTSSGRRFSHVYTCAQVPARSPASA 

RLRKEEMGALYVEEPRTQKPPILPSREAAEVLKD 

CIMEPPERLLYPHTSQEAPGMS 


3079 


A 


343 


1513 


FSPLEPRLCSLGGWGALQAGEPCQPSRAGCGRE 

GATMGCTLSAEERAALERSKAIEKNLKEDGISAA 

KDVKLLLLGAGESGKSTIVKQMKIIHEDGFSGED 

VKQYKPVVYSNTIQSLAAIVRAMDTLGIEYGDK 

ERKADAKMVCDVVSRMEDTEPFSAELLSAMMR 

LWGDSGIQECFNRSREYQLNDSAKYYLDSLDRIG 

AADYQPTEQDILRTRVKTTGIVETHFTFKNLHFR 

LFDVGGQRSERKXWIHCFEDVTAIIFCVALSGYD 

QVLHEDETTNRMHESLKLFDSICNNKWFTDTSIl 

LFLNKKDIFEEKIKKSPLTICFPEYTGPSAFTEAVA 

YIQAQYESKNKSAHKEIYSHVTCATDTNNIQFVF 

DAVTDVHAKNLRGCGLY 


3080 


A 


41 


997 


EARTARELTDGVTDGLTMADQPKPISPLKNLLA 

GGFGGVCLVFVGHPLDTVKVRLQTQPPSLPGQPP 

MYSGTFDCFRKTLFREGITGLYRGMAAPnGVTP 

MFAVCFFGFGLGKKLQQKHPEDVLSYPQLFAAG 

MLSGVFTTGIMTPGERIKCLLQIQASSGESKYTGT 

LDCAKKLYQEFGIRGIYKGTN^TLMRDVPASGM 

YFMTYEWLKNIFTPEGKRVSELSAPRILVAGGIA 

GIFNWAVAIPPDVLKSRFQTAPPGKYPNGFRDVL 

RELIRDEGVTSLYKGFNAVMIRAFPANAACFLGF 

EVAMKFLNWATPNL 


3081 


A 


3 


1996 


IMADMEDLFGSDADSEAERKDSDSGSDSDSDQE 










NAASGSNASGSESDQDERGDSGQPSNKELFGDD 

SEDEGASHHSGSDNHSERSDNRSEASERSDHEDN 

DPSDVDQHSGSEAPNDDEDEGHRSDGGSHHSEA 

EGSEKAHSDDEKWGREDKSDQSDDEKIQNSDDE 

ERAQGSDEDKLQNSDDDEKMQNTDDEERPQLS 

DDERQQLSEEEKANSDDERPVASDNDDEKQNSD 

DEEQPQLSDEEKMQNSDDERPQASDEEHRHSDD 

EEEQDHKSESARGSDSEDEVLRMKRKNAIASDSE 

ADSDTEVPKDNSGTMDLFGGADDISSGSDGEDK 

PPTPGQPVDENGLPQDQQEEEPIPETRIEVEIPKV 

NTDLGNDLYFVKLPNFLSVEPRPFDPQYYEDEFE 

DEEMLDEEGRTRLKLKVENTIRWRIRRDEEGNEI 

KESNARIVKWSDGSMSLHLGNEVFDYYKAPLQG 

DHNHLFIRQGTGLQGQAVFKTKLTFRPHSTDSAT 

HRKMTLSLADRCSKTQKIRILPMAGRDPECQRTE 

MKKEEERLRASIRRESQQRJRMREKQHQRGLSAS 

YLEPDRYDEEEEGEESISLAAIKNRYKGGIREERA 

RIYSSDSDEGSEEDKAQRLLKAKKLTSDEVRPNL 

FNSRGLSCTQEPTALNEELTDQAGTN 


3082 


A 


3 


921 


VEFCLPASADSSSLVAASLAGVRKMATNFLAHE 

KIWFDKFKYDDAERRFYEQMNGPVAGASRQEN 

GASVILRDIARARENIQKSLAGSSGPGASSGTSGD 

HGELVVRIASLEVENQSLRGWQELQQAISKLEA 

RLNVLEKSSPGHRATAPQTQHVSPMRQVEPPAK 

KPATPAEDDEDDDIDLFGSDNEEEDKEAAQLREE 

RLRQYAEKKAKKPALVAKSSILLDVKPWDDETD 

MAQLEACVRSIQLDGLVWGASKLVPVGYGIRKL 

QIQCVVEDDKVGTDLLEEEITKFEEHVQSVDIAA

FNKI 


3083 


A 


3 


921 


VEFCLPASADSSSLVAASLAGVRKMATNFLAHE 

KIWFDKFKYDDAERRFYEQMNGPVAGASRQEN 

GASVILRDIARARENIQKSLAGSSGPGASSGTSGD 

HGELVVRIASLEVENQSLRGWQELQQAISKLEA 

RLNVLEKSSPGHRATAPQTQHVSPMRQVEPPAK 

KPATPAEDDEDDDIDLFGSDNEEEDKEAAQLREE 

RLRQYAEKKAKKPALVAKSSILLDVKPWDDETD 

MAQLEACVRSIQLDGLVWGASKLVPVGYGBRKL 

QIQCVVEDDKVGTDLLEEEITKFEEHVQSVDIAA 

FNBCI 


3084 


A 


128 


4050 


KSIVKIRKRMAAETQTLNFGPEWLRALSSGGSITS 

PPLSPALPKYKLADYRYGREEMLALFLKDNKIPS 

DLLDKEFLPILQEEPLPPLALVPFTEEEQRNFSMS 

VNSAAVLRLTGRGGGGTWGAPRGRSSSRGRGR 

GRGECGFYQRSFDEVEGVFGRGGGREMHRSQS 

WEERGDRRFEKPGRKDVGRPNFEEGGPTSVGRK 

PIEFIRSESENWRIFREEQNGEDEDGGWRLAGSRR 

DGERWRPHSPDGPRSAGWREHMERRRRFEFDFR 

DRDDERGYRRVRSGSGSIDDDRDSLPEWCLEDA 

EEEMGTFDSSGAFLSLKKVQKEPIPEEQEMDFRP 

VDEGEECSDSEGSHNEEAKEPDKTNKKEGEKTD 

RVGVEASEETPQTSSSSARPGTPSDHQSQEASQFE 

RKDEPKTEQTEKAEEETRMENSLPAKVPSRGDE 

MVADVQQPLSQIPSDTASPLLILPPPVPNPSPTLRP 

VETPWGAPGMGSVSTEPPDEEGLKHLEQQAEK 

MVAYLQDSALDDERLASKLQEHRAKGVSIPLMH 










EAMQKWYYKDPQGEIQGPFNNQEMAEWFQAG 

YFTMSLLVK31ACDESFQPLGDIMKMWGRVPFSP 

GPAPPPHMGELDQERLTRQQELTALYQMQHLQY 

QQFLIQQQYAQVLAQQQKAALSSQQQQQLALLL 

QQFQTLKMRISDQNUPSVTRSVSVPDTGSIWELQ 

PTASQPTVWEGGSVWDLPLDTTTPGPALEQLQQ 

LEKAKAAKLEQERREAEMRAKREEEERKRQEEL 

RRRQKGILRRQQEEERKRREEEELARRKQEEALR 

RQREQEIALRRQREEEERQQQEEALRRLEERRRE 

EEERRKQEELLRKQEEEAAKWAREEEEAQRRLE 

ENRLRMEEEAARLRHEEEERKRKELEVQRQKEL 

MRQRQQQQEALRRLQQQQQQQQLAQMKLPSSS 

TWGQQSNTTACQSQATLSLAEIQKLEEERERQLR 

EEQRRQQRELMKALQQQQQQQQQKLSGWGNV 

SKPSGTTKSLLEIQQEEARQMQKQQQQQQQHQQ 

PNRAKNNTHSNLHTSIGNSVWGSINTGPPNQWA 

SDLVSSIWSNADTKNSNMGFWDDAVKEVGPRN 

STNKNKNNASLSKSVGVSNRQNKKVEEEEKLLK 

LFQGVNKAQDGFTQWCEQMLHALNTANNLDVP 

TFVSFLKEVESPYEVHDYIRAYLGDTSEAKEFAK 

QFLERRAKQKANQQRQQQQLPQQQQQPPQQPP 

QQPQQQDSVWGMNHSTLHSVFQTNQSNNQQSN 

FEAVQSGKKKKKQKMVRADPSLLGFSVNASSER 

LNMGEIETLDDY 


3085 


A 


128 


4050 


ksivkirkrmaaetqtlnfgpewlralssggsits 

pplspalpkykladyrygreemlalflkdnkips 

dlldkeflpilqeeplpplalvpfteeeqrnfsms 

vnsaavlrltgrggggtvvgaprgrsssrgrgr 

grgecgfyqrsfdevegvfgrgggremhrsqs 

weergdrrfekpgrkdvgrpnfeeggptsvgrk 

hefirsesenwrifreeqngededggwrlagsrr 

dgerwrphspdgprsagwrehmerrrrfefdfr 

drddergyrrvrsgsgsidddrdslpewcleda 

eeemgtfdssgaflslkkvqkepipeeqemdfrp 

vdegeecsdsegshneeakepdktnkkegektd 

rvgveaseetpqtssssarpgtpsdhqsqeAsqfe 

rkdepkteqtekaeeetrmenslpakvpsrgde 

mvadvqqplsqipsdtaspllilpppvpnpsptlrp 

vetpvvgapgmgsvstepddeeglkhleqqaek 

MVAYLQDSALDDERLASKLQEHRAKGVSIPLMH 

EAMQKWYYKDPQGEIQGPFNNQEMAEWFQAG 

YFmSLLVKRACDESFQPLGDIMKMWGRVPFSP 

GPAPPPHMGELDQERLTRQQELTALYQMQHLQY 

QQFLIQQQYAQVLAQQQKAALSSQQQQQLALLL 

QQFQTLKMRISDQNUPSVTRSVSVPDTGSIWELQ 

PTASQPTVWEGGSVWDLPLDTTTPGPALEQLQQ 

LEICAKAAKLEQERREAEMRAKREEEERKRQEEL 

RRRQKGILRRQQEEERJCRREEEELARRKQEEALR 

RQREQEIALRRQREEEERQQQEEALRRLEERRRE 

EEERRKQEELLRKQEEEAAKWAREEEEAQRRLE 

ENRLRMEEEAARLRHEEEERKRKELEVQRQKEL 

MRQRQQQQEALRRLQQQQQQQQLAQMKLPSSS 

TWGQQSNTTACQSQATLSLAEIQKLEEERERQLR 

EEQRRQQRELMKALQQQQQQQQQKLSGWGNV 

SKPSGTTKSLLEIQQEEARQMQKQQQQQQQHQQ 










PNRARNNTHSNLHTSIGNSVWGSINTGPPNQWA 

SDLVSSIWSNADTKNSNMGFWDDAVKEVGPRN 

STNKNKNNASLSKSVGVSNRQNKKVEEEEKLLK 

LFQGVNKAQDGFTQWCEQMLHALNTANNLDVP 

TFVSFLKEVESPYEVHDYIRAYLGDTSEAKEFAK 

QFLERRAKQKANQQRQQQQLPQQQQQPPQQPP 

QQPQQQDSVWGMNHSTLHSVFQTNQSNNQQSN 

FEAVQSGKKKKKQKMVRADPSLLGFSVNASSER 

LNMGEIETLDDY 


3086


A 


675 


1334 


LHPAATSTAWLHVPPGLSMALSWVLTVLSLLPL 

LEAQIPLCANLVPVPITNATLDRITGKWFYIASAF 

RNEEYNKSVQEIQATFF YFTPNKTEDTIFLREYQT 

RQDQCIYNTTYLNVQRENGTISRYVGGQEHFAH 

LLILRDTKTYMLAFDVNDEKNWGLSVYADKPET 

TKEQLGEFYEALDCLRIPKSDVVYTDWKKDKCE 

PLEKQHEKERKQEEGES 


3087 


A 


1 


1575 


CTPVARSMATTATCTRFTDDYQLFEELGKGAFS 

VVRRCVKKTSTQEYAAKIINTKKLSARDHQKLE 

REARICRLLKHPNIVRLHDSISEEGFHYLVFDLVT 

GGELFEDIVAREYYSEADASHCIHQILESVNHIHQ 

HDIVHRDLKPBNLLLASKCKGAAVKLADFGLAIE 

VQGEQQAWFGFAGTPGYLSPEVLRKDPYGKPVD 

IWACGVILYILLVGYPPFWDEDQHKLYQQIKAG 

AYDFPSPEWDTVTPEAKNLINQMLTINPAKRITA 

DQALKJHPWVCQRSTVASMMHRQETVECLRKFN 

ARRKLKGAILTTMLVSRNFSAAKSLLNKKSDGG 

VKPQSNNKNSLVSPAQEPAPLQTAMEPQTTVVH 

NATDGIKGSTESCNTTTEDEDLKVRKQEIIKITEQ 

LffiAINNGDFEAYTKICDPGLTSFEPEALGNLVEG 

MDFHKFYFENLLSKNSKPIHTTILNPHVHVIGED 

AACIAYIRLTQYIDGQGRPRTSQSEETRVWHRRD 

GKWLNVHYHCSGAPAAPLQ 


3088 


A 


12 


1039 


SSVAEFPERVQLSQPQNWNFSGAGGAWSLDFAE 

QLKWSAELARLGESIMDGKQGGMDGSKPAGPR 

DFPGIRLLSNPLMGDAVSDWSPMHEAAIHGHQL 

SLRNLISQGWAVNnXADHVSPLHEACLGGHLSC 

VKILLKHGAQVNGVTADWHTPLFNACVSGSWD 

CVNLLLQHGASVQPESDLASPIHEAARRGHVEC 

VNSLIAYGGNIDHKISHLGTPLYLACENQQRACV 

KKLLESGADVNQGKGQDSPLHAVARTASEELAC 

LLMDFGADTQAKNAEGKRPVELVPPESPLAQLF 

LEREGPPSLMQLCRLRIRKCFGIQQHHKITKLVLP 

EDLKQFLLHL 


3089 


A 


73 


432 


DMAGLMTWTSLLFLGVCAHHIIPTGSVVLPSPCC 
MFFVSKRIPENRVVSYQLSSRSTCLKAGVIFTTKK 
GQQFCGDPKQEWVQRYMKNLDAKQKKASPRA 
RAVAVKGPVQRYPGNQTTC 


3090 


A 


4627 


611 


LMEAGGGGGALPAGVETMVLTLGESWPVLVGR 

RFLSLSAADGSDGSHDSWDVERVAEWPWLSGTI 

RAVSHTDVTKKDLKVCVEFDGESWRKRRWIEV 

YSLLRRAFLVEHNLVLAERKSPEISERIVQWPAIT 

YKPLLDKAGLGSITSVRFLGDQQRVFLSKDLLKP 

IQDVNSLRLSLTDNQIVSKEFQALIVKHLDESHLL 

KGDKNLVGSEVKIYSLDPSTQWFSATVVNGNPA 

SKTLQVNCEEIPALKIVDPSLIHVEVVHDNLVTC 










GNSARIGAVKRKSSENNGTLVSKQAKSCSEASPS 

MCPVQSVPTTVFKEILLGCTAATPPSKDPRQQST 

PQAANSPPNLGAKIPQGCHKQSLPEEISSCLNTKS 

EALRTKPDVCKAGLLSKSSQIGTGDLKILTEPKGS 

CTQPKTNTDQENRLESVPQALTGLPKECLPTKAS 

SKAELEIANPPELQKHLEHAPSPSDVSNAPEVKA 

GVNSDSPNNCSGKKVEPSALACRSQNLKESSVK 

VDNESCCSRSNNKIQNAPSRKSVLTDPAKLKKLQ 

QSGEAFVQDDSCVNIVAQLPKCRECRLDSLRKD 

KEQQKDSPVFCRFFHFRRLQFNKHGVLRVEGFLT 

PNKYDNEAIGLWLPLTKNVVGIDLDTAKYILANI 

GDHFCQMVISEKEAMSTffiPHRQVAWKRAVKG 

VREMCDVCDTTIFNLHWVCPRCGFGVCVDCYR 

MKRKNCQQGAAYKTFSWLKCVKSQIHEPENLM 

PTQIIPGKALYDVGDIVHSVRAKWGIKANCPCSN 

RQFKLFSKPASKEDLKQTSLAGEKPTLGAVLQQ 

NPSVLEPAAVGGEAASBCPAGSMKPACPASTSPLN 

WLADLTSGlSm^^KENKEKQPTMPILKNEIK^^ 

PPLSKSSTVLHTFNSTILTPVSNNNSGFLRNLLNSS 

TGKTENGLKNTPKILDDIFASLVQNKTTSDLSKR 

PQGLTIKPSILGFDTPHYWLCDNRLLCLQDPNNK 

SNWNVFRECWKQGQPVMVSGVHHKLNSELWK 

PESFRKEFGEQEVDLVNCRTNEIITGATVGDFWD 

GFEDVPNRLKNEKEPMVLKLKDWPPGEDFRDM 

MPSRFDDLMANIPLPEYTRRDGKLNLASRLPNYF 

VRPDLGPKMYNAYGLITPEDRKYGTTNLHLDVS 

DAANVMVYVGDPKGQCEQEEEVLKTIQDGDSDE 

LTIKRFIEGKEKPGALWHIYAAKDTEKIREFLKK 

VSEEQGQENPADHDPIHDQSWYLDRSLRKRLHQ 

EYGVQGWAIVQFLGDVVFIPAGAPHQVHNLYSC 

IKVAEDFVSPEHVKHCFWLTQEFRYLSQTHTNHE 

DKLQVKNVIYHAVKDAVAMLKASESSFGKP 


3091


A 


97 


1838 


KRGARRGGWKRKMPSTDLLMLKAFEPYLEILEV 

YSTKAKNYVNGHCTKYEPWQLIAWSWWTLLI 

VWGYEFVFQPESLWSRFKKKCFKLTRKMPIIGRK 

IQDKLNKTKDDISK>JMSFLKVDKEYVKALPSQG 

LSSSAVLEKLKEYSSMDAFWQEGRASGTVYSGE 

EKLTELLVKAYGDFAWSNPLHPDIFPGLRKIEAEI 

VRIACSLFNGGPDSCGCVTSGGTESILMACKAYR 

DLAFEKGIKTPEIVAPQSAHAAFNKAASYFGMKI 

VRVPLTKMMEVDVRAMRRAISRNTANlLVeSTP 

QFPHGVIDPVPEVAKLAVKYKIPLHVDACLGGFL 

IVFMEKAGYPLEHPFDFRVKGVTSISADTHKYGY 

APKGSSLVLYSDKKYRNYQFFVDTDWQGGIYAS 

PTIAGSRPGGISAACWAALMHFGENGYVEATKQI 

IKTARFLKSELENIKGIFVFGNPQLSVIALGSRDFD 

lYRLSNLMTAKGWNLNQLQFPPSIHFCITLLHAR 

KRVAIQFLKDIRESVTQIMKNPKAKTTGMGAIYG 

MAQTTVDRNMGAELSSVFLDSLYSTDTVTQGSQ 

MNGSPKPH 


3092 


A 


79 


2652 


LCSQNSPEDWVNFSSEKQKRYPWYWTGRKLRSE 

RAMKIQKKLTGCSRLMLLCLSLELLLEAGAGNIH 

YSVPEETDKGSFVGNIAKDLGLQPQELADGGVRI 

VSRGRMPLFALNPRSGSLITARRIDREELCAQSM 

PCLVSFNILVEDKMKLFPVEVEIIDINDNTPQFQL 













EELEFKMNEITTPGTRVSLPFGQDLDVGMNSLQS 

YQLSSNPHFSLDVQQGADGPQHPEMVLQSPLDR 

EBEAVHHLDLTASDGGEPVRSGTLRIYIQVVDAN 

DNPPAFTQAQYHINVPENVPLGTQLLMVNATDP 

DEGANGEVTYSFHNVDHRVAQIFRLDSYTGEISN 

KEPLDFEEYKMYSMEVQAQDGAGLMAKVKVLI 

KVLDVNDNAPEVTITSVTTAVPENFPPGTIIALISV 

HDQDSGDNGYTTCFIPGNLPFKLEKLVDNYYRL 

VTERTLDRELISGYNITITAIDQGTPALSTETHISL 

LVTDINDNSPVFHQDSYSAYIPENNPRGASIFSVR 

AHDLDSNENAQITYSLIEDTIQGAPLSAYLSINSD 

TGVLYALRSFDYEQFRDMQLKVMARDSGDPPLS 

SNVSLSLFLLDQNDNAPEILYPALPTDGSTGVEL 

APRSAEPGYLVTKVVAVDRDSGQNAWLSYRLL 

KASEPGLFSVGLHTGEVRTARALLDRDALKQSL 

WAVQDHGQPPLSATVTLTVAVADRIPDILADLG 

SLEPSAKPNDSDLTLYLVVAEAAVSCVFLAFVIV 

LLAHRLRRWHKSRLLQASGGGLASTPGSHFVGV 

DGVRAFLQTYSHEVSLTADSRKSHLEFPQPNYAD 

TLISQESCEKKGFLSAPQSLLEDKKEPFSQVNFCD 

ECISYLEKNNS 


3093 


A 


1 


3868 


PPDNQKLGLLEALLKIGDWQHAQNIMDQMPPYY 

AASHKLIALAICKLIHITIEPLYRSVTSWAVDHAG 

FLESDPCDSTVGHLLSRVGVPKGAKGSPVNALQ 

NKRAPKQAESFEDLRRDVFNMFCYLGPHLSHDPI 

LFAKVVRIGKSFMKEFQSDGSKQEDKEKTEVILS 

CLLSITDQVLLPSLSLMDCNACMSEELWGMFKT 

FPYQHRYRLYGQWKNETYNSHPLLVKVKAQTID 

RAKYIMKRLTKENVKPSGRQIGKLSHSNPTILFD 

YVCFEILSQIQKYDNLITPVVDSLKYLTSLNYDVL 

AClLSNCIIEALANPEKERMKHDDTnSSWLQSLA 

SFCGAVFRKYPIDLAGLLQYVANQLKAGKSFDL 

LILKEVVQKMAGIEITEEMTMEQLEAMTGGEQL 

KAEGGYFGQIRNTKKSSQRLKDALLDHDLALPL 

CLLMAQQRNGVIFQEGGEKHLKLVGKLYDQCH 

DTLVQFGGFLASNLSTEDYIKRVPSIDVLCNEFHT 

PHDAAFFLSRPMYAHfflSSKYDELKKSEKGSKQ 

QHKVHKYITSCEMVMAPVHEAVVSLHVSKVWD 

DISPQFYATFWSLTMYDLAVPHTSYEREVNKLK 

VQMKAIDDNQEMPPNKKKKEKERCTALQDKLL 

EEEKKQMEHVQRVLQRLKLEKDNWLLAKSTKN 

ETITKFLQLCIFPRCIFSAIDAVYCARFVELVHQQ 

KTPNFSTLLCYDRVFSDIIYTVASCTENEASRYGR 

FLCCMLETVTRWHSDRATYEKECGNYPGFLTIL 

RATGFDGGNKADQLDYENFRHWHKWHYKLT 

KASVHCLETGEYTHIRNILIVLTKILPWYPKVLNL 

GQALERRVHKICQEEKEKRPDLYALAMGYSGQL 

KSRKSYMIPENEFHHKDPPPRNAVASVQNGPGG 

GPSSSSIGSASKSDESSTEETDKSRERSQCGVKAV 

NKASSTTPKGNSSNGNSGSNSNKAVKENDKEKG 

KEKEKEKKEKTPATTPEARVLGKDGKEKPKEER 

PNKDEKARETKERTPKSDKEKEKFKKEEKAKDE 

KFKTTVPNAESKSTQEREREKEPSRERDIAKEMK 

SKENVKGGEKTPVSGSLKSPVPRSDIPEPEREQKR 

RKIDTHPSPSHSSTVKDSLIELKESSAKLYINHTPP 










PLSKSKERENIDKKDLDKSRERSREREKKDEKDR 

KERKRDHSNNDREVPPDLTKRRKEENGmGVSK 

HKSESPCESPYPNEKDKEKNKSKSSGKEKGSDSF 

KSEKMDKISSGGKKESRHDKEKIEKKEKRDSSGG 

KEEKKHHKSSDKHR 


3094 


A 


2 


891 


AMLGTREPSRRGAGAVQAEVSERLAMAGPQQQ 
PPYLHLAELTASQFLEIWKHFDADGNGYffiGKEL 
ENFFQELEKARKGSGMMSKSDNFGEKMKEFMQ 
KYDKNSDGKIEMAELAQILPTEENFLLCFRQHVG 
SSAEFMEAWRKYDTDRSGYIEANELKGFLSDLL 
KKANRPYDEPKLQEYTQTILRMFDLNGDGKLGL 
SEMSRLLPVQENFLLKFQGMKLTSEEFNAIFTFY 
DKDRSGYIDEHELDALLKDLYEKNKKEMNIQQL 
TNYRKS VMSLAEAGKLYRKDLEIVLCSEPPM


3095 


A 


1685 


700 


RRPTGRPGALGAPAAGRVGMPLHVKWPFPAVPP 

LTWTLASSVVMGLVGTYSCFWTKYMNHLTVHN 

REVLYELIEKRGPATPLITVSNHQSCMDDPHLWG 

ILKLRHIWNLKLMRWTPAAADICFTKELHSHFFS 

LGKCVPVCRGAEFFQAENEGKGVLDTGRHMPG 

AGKRREKGDGVYQKGMDFILEKLNHGDWVHIF 

PEGKVNMSSEFLRFKWGIGRLIAECHLNPIILPLW 

HVGMNDVLPNSPPYFPRFGQKITVLIGKPFSALP 

VLERLRAENKSAVEMRKALTDFIQEEFQHLKTQ 

AEQLHNHLQAWEIGLACCLLDSWPAQSWG 


3096 


A 


6642 


4022 


FVPGLREPQWEPAQPSATMSAPSEEEEYARLVM 

EAQPEWLRAEVKRLSHELAETTREKIQAAEYGL 

AVLEEKHQLKLQFEELEVDYEAIRSEMEQLKEAF 

GQAHTNHKKVAADGESREESLIQESASKEQYYV 

RKVLELQTELKQLRNVLTNTQSENERLASVAQE 

LKEINQNVEIQRGRLRDDIKEYKFREARLLQDYS 

ELEEENISLQKQVSVLRQNQVEFEGLKHEIKRLE 

EETEYLNSQLEDAIRLKEISERQLEEALETLKTER 

EQKNSLRKELSHYMSINDSFYTSHLHVSLDGLKF 

SDDAAEPNNDAEALVNGFEHGGLAKLPLDNKTS 

TPKKEGLAPPSPSLVSDLLSELNISEIQKLKQQLM 

QMEREKAGLLATLQDTQKQLEHTRGSLSEQQEK 

VTRLTENLSALRRLQASKERQTALDNEKDRDSH 

EDGDYYEVDINGPEILACKYHVAVAEAGELREQ 

LKALRSTHEAREAQHAEEKGRYEAEGQALTEKV 

SLLEKASRQDRELLARLEKELKKVSDVAGETQG 

SLSVAQDELVTFSEELANLYHHVCMCNNETPNR 

VMLDYYREGQGGAGRTSPGGRTSPEARGRRSPI 

LLPKGLLAPEAGRADGGTGDSSPSPGSSLPSPLSD 

PRREPMNIYNLIAIIRDQIKHLQAAVDRTTELSRQ 

RIASQELGPAVDKDKEALMEEELKLKSLLSTKRE 

QITTLRTVLKANKQTAEVALANLKSKYENEKAM 

VTETMMKLRNELKALKEDAATFSSLRAMFATRC 

DEYITQLDEMQRQLAAAEDEKKTLNSLLRMAIQ 

QKLALTQRLELLELDHEQTRRGRAKAAPKTKPA 

TPSVSHTCACASDRAEGTGLANQVFCSEKHSIYC 

D 


3097 


A 


1 


879 


MVKVVPATRGNLPRSQLTGTHQHCQPREPKITA 
SERLRRRPRATARLRAHAAPPEPPLAVFAPPSDR 
KELLALPVACDPVIASVMSWVQAASLIQGPGDK 
GDVFDEEADESLLAQREWQSNMQRRVKEGYRD 













GIDAGKAVTLQQGFNQGYKKGAEVILNYGRLRG 
TLSALLSWCHLHNNNSTLINKINNLLDAVGQCEE 
YVLKHLKSITPPSHWDLLDSIEDMDLCHVVPAE 
KKTOEAKDERLCENNAEFmNCSKSHSGIDCSYV 
ECCRTQEHAHSGKPKPHMDFGTDSQF 


3098 


A 


2 


505 


GAATLLRSASSAARKAAEAEQVWLHLHRYLSA 

DRRVLGLREWGRPASERECSLCQRLKRELNMGD 

VEKGKKIRMKCSQCHTVEKGGKHKTGPNLHGL 

FGRKTGQAPGYSYTAANKNKGIIWGEDTLMEYL 

ENPKKYIPGTKMIFVGIKKKEERADLIAYLKKAT 

NE 


3099 


A 


144 


1386 


WAVGQARSFPSHPRMSSWIWSRRWSPSVALRVT 

CTSTSSQRWTVLALSKPGSQQQVSMHTPAPGPPT 

AGHTEPPSEPPRRARVAKYRAKFDPRVTAKYDIK 

ALIGRGSFSRVVRVEHRATRQPYAIKMIETKYRE 

GREVCESELRVLRRVRHANIIQLVEVFETQERVY 

MVMELATGGELFDRIIAKGSFTERDATRVLQMV 

LDGVRYLHALGITHRDLKPENLLYYHPGTDSKIII 

TDFGLASARKKGDDCLMKTTCGTPEYIAPEVLV 

RKPYTNSVDMWALGVIAYILLSGTMPFEDDNRT 

RLYRQILRGKYSYSGEPWPSVSNLAKDFIDRLLT 

VDPGARMTALQALRHPWVVSMAASSSMKNLHR 

SISQNLLKRASSRCQSTKSAQSTRSSRSTRSNKSR 

RVRERELREL 


3100 


A


3 


1500 


ARAVNGRWYQVPAWPGPGCGTNASGERQRQLPR 

AWRPVGRTLGSEPIALAWSPPLYLFPIPLPSWAVS 

QPTPTLGTMFADLDYDIEEDKLGIPTVPGKVTLQ 

KDAQNLIGISIGGGAQYCPCLYIVQVFDNTPAAL 

DGTVAAGDEITGVNGRSIKGKTKVEVAKMIQEV 

KGEVTIHYNKLQADPKQGMSLDIVLKKVKHRLV 

ENMSSGTADALGLSRAILCNDGLVKRLEELERTA 

ELYKGMTEHTKNLLRAFYELSQTHRGNGIPQSC 

AFGDVFSVIGVREPQPAASEAFVKFADAHRSIEK 

FGIRLLKTIKPMLTDLNTYLNKAEPDTRLTIKKYL 

DVKFEYLSYCLKVKEMDDEEYSCIALGEPLYRV 

STGNYEYRLILRCRQEARARFSQMRKDVLEKME 

LLDQKHVQDIVFQLQRLVSTMSKYYNDCYAVLR 

DADVFPIEVDLAHTTLAYGLNQEEFTDGEEEEEE 

EDTAAGEPSRDTRGAAGPLDKGGSWCDS 


3101 


A 


1173 


197 


QGMDSKQQCVKLNDGHFMPVLGFGTYAPPEVP 

RSKALEVTKLAIEAGFRHIDSAHLYNNEEQVGLA 

IRSKIADGSVKREDIFYTSKLWSTFHRPELVRPAL 

ENSLKKAQLDYVDLYLIHSPMSLKPGEELSPTDE 

NGKVIFDIVDLCTTWEAMEKCKDAGLAKSIGVS 

NFNRRQLEI^NKPGLKYKPVCNQVECHPYFNR 

SKLLDFCKSKDIVLVAYSALGSQRDKRWVDPNS 

PVLLEDPVLCALAKKHKRTPALIALRYQLQRGV 

VVLAKSYNEQRIRQNVQVFEFQLTAEDMKAIDG 

LDRNLHYFNSDSFASHPNYPYSDEY 


3102 


A 


144 


1098 


EQPRPPPCGRRPLPLGSAPCRVRLGRAPRQAPAM 

SMLPSFGFTQEQVACVCEVLQQGGNLERLGRFL 

WSLPACDHLHKNESVLKAKAVVAFHRGNFREL 

YKILESHQFSPHNHPKLQQLWLKAHYVEAEKLR 

GRPLGAVGKYRVRQKFPLPRTIWDGEETSYCFK 

EKSRGVLREWYAHNPYPSPREKRELAEATGLTT 










TQVSNWFKNRRQRDRAAEAKERENTENNNSSSN 
KQNQLSPLEGGKPLMSSSEEEFSPPQSPDQNSVLL 
LQGNMGHARSSNYSLPGLTASQPSHGLQTHQHQ 
LQDSLLGPLTSSLVDLGS 


3103 


A 


111 


1582 


LVYSWGCHIMADNDTDRNQTEKLLKRVRELEQ 

EVQRLKKEQAKlSnfCBDSNIRENSSGAGKTKRAFD 

FSAHGRRHVALRIAYMGWGYQGFASQENTNNTI 

EEKLFEALTKTRLVESRQTSNYHRCGRTDKGVS 

AFGQVISLDLRSQFPRGRDSEDFNVKEEANAAAE 

EIRYTHILNRVLPPDIRILAWAPVEPSFSARFSCLE 

RTYRYFFPRADLDIVTMDYAAQKYVGTHDFRNL 

CKMDVANGVINFQRTILSAQVQLVGQSPGEGRW 

QEPFQLCQFEVTGQAFLYHQVRCMMAILFLIGQ 

GMEKPEIIDELLNIEKNPQKPQYSMAVEFPLVLY 

DCKFENVKWIYDQEAQEFNITHLQQLWANHAV 

KTHMLYSMLQGLDTVPVPCGIGPKMDGMTEWG 

NVKPSVIKQTSAFVEGVKMRTYKPLMDRPKCQG 

LESRIQHFVRRGPaEHPia.FHEEETKAKRDCNDT 

LEEDNTNLETPTKRVCVDTEIKSII 


3104 


A 


227


1519 


VTLIKMNAMLETPELPAVFDGVKLAAVAAVLYV 

IVRCLNLKSPTAPPDLYFQDSGLSRFLLKSCPLLT 

KEYIPPLIWGKSGfflQTALYGKMGRVRSPHPYGH 

RKFITMSDGATSTFDLFEPLAEHCVGDDITMVICP 

GIANHSEKQYIRTFVDYAQKNGYRCAVLNHLGA 

LPNIELTSPRMFTYGCTWEFGAMVNYIKKTYPLT 

QLVVVGFSLGGNIVCKYLGETQANQEKVLCCVS 

VCQGYSALRAQETFMQWDQCRRFYNFLMADN 

MKKIILSHRQALFGDHVKKPQSLEDTDLSRLYTA 

TSLMQIDDNVMRKFHGYNSLKEYYEEESCMRYL 

HRIYVPLMLVNAADDPLVHESLLTIPKSLSEKRE 

NVMFVLPLHGGHLGFFEGSVLFPEPLTWMDKLV 

VEYANAICQWERNKLQCSDTEQVEADLE 


3105 


A 


1 


1251 


MGLLLMILASAVLGSFLTLLAQFFLLYRRQPEPP 

ADEAARAGEGFRYIKPVPGLLLREYLYGGGRDE 

EPSGAAPEGGATPTAAPETPAPPTRETCYFLNATI 

LFLFRELRDTALTRRWTKKIKVEFEELLQTKTA 

GRLLEGLSLRDVFLGETVPFIKTIRLVRPWPSAT 

GEPDGPEGEALPAACPEELAFEAEVEYNGGFHLA 

IDVDLVFGKSAYLFVKLSRVVGRLRLVFTRVPFT 

HWFFSFVEDPLIDFEVRSQFEGRPMPQLTSIIVNQ 

LKKIIKRKHTLPNYKIRFKPFFPYQTLQGFEEDEE 

HffllQQWALTEGRLKVTLLECSRLLIFGSYDREA 

NVHCTLELSSSVWEEKQRSSIKTGTISLTAVFMG 

WHRVSEAFPGLWYKLLVDLPFWGLEDGGPLLT 

VPLRQCPG 


3106 


A 


972 


468 


MAAAGAGRLRRVASALLLRSPRLPARELSAPAR 

LYHKKVVDHYENPRNVGSLDKTSKNVGTGLVG 

APACGDVMKLQIQVDEKGKIVDARFKTFGCGSA 

lASSSLATEWVKGKTVEEALTIKNTDIAKELCLPP 

VKLHCSMLAEDAIKAALADYKLKQEPKKGEAE 

KK 


3107 


A 


106 


1221 


TCQDVRSVFSLVRANIFGEESTAGAGWHREEDM 
RKELQLSLSVTLLLVCGFLYQFTLKSSCLFCLPSF 
KSHQGLEALLSHRRGIVFLETSERMEPPHLVSCS 
VESAAKIYPEWPWFFMKGLTDSTPMPSNSTYPA 










FSFLSAIDNVFLFPLDMKRLLEDTPLFSWYNQINA 

SAEIWWLHISSDASRLAnWKYGGIYMDTDVISIR 

PIPEENFLAAQASRYSSNGIFGFLPHHPFLWECME 

NFVEHYNSAIWGNQGPELMTRMLRVWCKLEDF 

QEVSDLRCLNISFLHPQRFYPISYREWRRYYEVW 

DTEPSFm^SYALHLWNHMNQEGRAVIRGSNTLV 

ENLYRKHCPRTYRDLIKGPEGSVTGELGPGNK 


3108 


A 


1612 


839 


EVALFCFEMAAGMYLEHYLDSIENLPFELQRNFQ 

LMRDLDQRTEDLKAEIDKLATEYMSSARSLSSEE 

KLALLKQIQEAYGKCKEFGDDKVQLAMQTYEM 

VDKHIRRLDTDLARFEADLKEKQIESSDYDSSSS 

KGKKKGRTQKEKKAARARSKGKNSDEEAPKTA 

QKKLKLVRTSPEYGMPSVTFGSVHPSDVLDMPV 

DPNEPTYCLCHQVSYGEMIGCDNPDCSffiWFHFA 

CVGLTTKPRGKWFCPRCSQERKKK 


3109 


A 


1 


2613 


MVAVRAAGPREGASQDEAGTVWAPMTGCPCQC 

RPGPSWLLVDTLEPETAYPVQRPGPEQAGNQRL

QMKRAQFGPHDWLSLPVPPGPSWLLVDTLEPET 

AYQFSVLAQNKLGTSAFSEVVTVNTLAFPITTPEP 

LVLVTPPRCLIANRTQQGVLLSWLPPANHSFPIDR 

YIMEFRVAERWELLDDGIPGTEGEFFAKDLSQDT 

WYEFRVLAVMQDLISEPSNIAGVSSTDIFPQPDLT 

EDGLARPVLAGIVATICFLAAAILFSTLAACFVNK 

QRKRKLKRKKDPPLSITHCRKSLESPLSSGKVSPE 

SIRTLRAPSESSDDQGQPAAKRMLSPTREKELSL 

YKKTKRAISSKKYSVAKAEAEAEATTPIELISRGP 

DGRFVMDPAEMEPSLKSRRIEGFPFAEETDMYPE 

FRQSDEENEDPLVPTSVAALKSQLTPLSSSQESYL 

PPPAYSPRFQPRGLEGPGGLEGRLQATGQARPPA 

PRPFHHGQYYGYLSSSSPGEVEPPPFYVPEVGSPL 

SSVMSSPPLPTEGPFGHPTIPEENGENASNSTLPLT 

QTPTGGRSPEPWGRPEFPFGGLETPAMMFPHQLP 

PCDVPESLQPKAGLPRGLPPTSLQVPAAYPGILSL 

EAPKGWAGKSPGRGPVPAPPAAKWQDRPMQPL 

VSQGQLRHTSQGMGIPVLPYPEPAEPGAHGGPST 

FGLDTRWYEPQPRPRPSPRQARRAEPSLHQVVLQ 

PSRLSPLTQSPLSSRTGSPELAARARPRPGLLQQA 

EMSEITLQPPAAVSFSRKSTPSTGSPSQSSRSGSPS 

YRPAMGFTTLATGYPSPPPGPAPAGPGDSLDVFG 

QTPSPRRTGEELLRPETPPPTLPTLGKLRRDRPAP 

ATSPPERALSKL 


3110 


A 


88 


924 


ILGSRTMSLTNTKTGFSVKDILDLPDTNDEEGSV 

AEGPEEENEGPEPAKRAGPLGQGALDAVQSLPL 

KNPFYDSSDNPYTRWLASTEGLQYSLHGLAAGA 

PPQDSSSKSPEPSADESPDNDKETPGGGGDAGKK 

RKRRVLFSKAQTYELERRFRQQRYLSAPEREHLA 

SLIRLTPTQVKIWFQNHRYKMKRARAEKGMEVT 

PLPSPRRVAVPVLVRDGKPCHALKAQDLAAATF 

QAGIPFSAYSAQSLQHMQYNAQYSSASTPQYPT 

AHPLVQAQQWTW 


3111 


A 


595 


291 


PSVASLARRFSGRALWPPSHSVPGNRALCPRLLH 
GTTLPGGNQRELARQKNMKKQSDSVKGKRRDD 
GLSAAARKQRDSTPRDSEIMQQKQKKANEKKEE 
PK 


3112 


A 


3641 


1555 


APMLQIHHFSFKLIFQNIHKSKFISQRLSQNADST 










RHTNLSNTHYSDLIVWNCCLFFRNWCNEFFLKS 

CHFAQEREGSGDLCNSRAEKTKSAACVIFRRFPV 

APLIPYPLITKEDINAIEMEEDKRDLISREISKFRDT 

HKKLEEEKGKKEKERQEIEKERRERERERERERE 

RREREREREREREREKEKERERERERDPIDRDRTK 

ERDRDRDRERDRDRDRERSSDRNKDRSRSREKS 

RDRERERERERERERERERERERERERERERERE 

REREKDKKRDREEDEEDAYERRKLERKLREKEA 

AYQERLKNWEIRERKKTREYEKEAEREEERRRE 

MAKEAKRLKEFLEDYDDDRDDPKYYRGSALQK 

RLRDREKEMEADERDRKREKEELEEIRQRLLAE 

GHPDPDAELQRMEQEAERHRQPQIKQEPESEEEE 

EEKQEKEEKREEPMEEEEEPEQKPCLKPTLRPISS 

APSVSSASGNATPNTPGDESPCGIIIPHENSPDQQ 

QPEEHRPKIGLSLKLGASNSPGQPNSVKRKKLPV 

DSVFNKFEDEDSDDVPRKRKLVPLDYGEDDKNA 

TKGTVNTEEKRKHIKSLIEKIPTAKPELFAYPLDW 

SIVDSILMERRIRPWINKKIIEYIGEEEATLVDLVC 

SKVMAHSPPQSILDDVAMVLDEEAEVFIVKMWR 

LLIYETEAKKIGLVK 


3113 


A 


1 


669 


VCAGIRDPCSTPLAKPAAGGAENLSFGKQPGLET 
NILKMTTPNKTPPGADPKQLERTGTVREIGSQAV 
WSLSSCKPGFGVDQLRDDNLETYWQSDGSQPHL 
VNIQFRRKTTVKTLCIYADYKSDESYTPSKISVRV 
GNNFHNLQEIRQLELVEPSGWIHVPLTDNHKKPT 
RTFMIQIAVLANHQNGRDTHMRQIKIYTPVEESSI 
GKFPRCTTIDFMMYRSIR 


3114 


A 


1 


1613 


MTSKEESRRQQPTAGPAGQGKLPSPSEPQLPTPP 

TRSLHHFRRPLSPSREAQAHIAPSSELHLPQSQSA 

GPPPLGAGTEVELVVPGRDEGSRGALPGSSGVKF 

VWRKIVRFPVSDQVRTLSISRLMRRLLEMMQTL 

VQFnGWRSLLGRTLGTIMNTlVr^MMAQILRSH 

LIKATVIPNRVKMLPYFGIIRNRMMSTHKSKKKI 

REYYRLLNVEEGCSADEVRESFHKLAKQYHPDS 

GSNTADSATFIRIEKAYRKVLSHVIEQTNASQSK 

GEEEEDVEKFKYKTPQHRHYLSFEGIGFGTPTQR 

EKHYRQFRADRAAEQVMEYQKQKLQSQYFPDS 

VIVKNIRQSKQQKITQAIERLVEDLIQESMAKGDF 

DNLSGKGKPLKKFSDCSYIDPMTHNLNRILIDNG 

YQPEWILKQKEISDTIEQLREAILVSRKKLGNPMT 

PTEKKQWNHVCEQFQENIRKLNKRJNDFNLIVPI 

LTRQKVHFDAQKEIVRAQKIYETLIKTKEVTDRN 

PNNLDQGEGEKTPEIKKGFLNLMDLVEIY 


3115 


A 


1 


2036 


FRURCGCLSYCRSRRGIRRVEPLRRARARVGPRF 

RPLCRMEIIRSNFKSNLHKVYQAIEEADFFAE)GE 

FSGISDGPSVSALTNGFDTPEERYQKLKKHSMDF 

LLFQFGLCTFKYDYTDSKYITKSFNFYVFPKPFNR 

SSPDVKFVCQSSSIDFLASQGFDFNKGFRKGIPYL 

NQEEERQLREQYDEKRSQANGAGALSYVSPNTS 

KCPVTIPEDQKXFroQWEKIEDLLQSEENKNLDL 

EPCTGFQRKLIYQTLSWKYPKGIHVETLETEKKE 

RYIVISKVDEEERKRREQQKHAKEQEELNDAVG 

FSRVIHAIANSGKLVIGHNMLLDVMHTVHQFYC 

PLPADLSEFKEMTTCVFPRLLDTKLMASTQPFKD 

IINNTSLAELEKRLKETPFNPPKVESAEGFPSYDT 










ASEQLHEAGYDAYITGLCnSMANYLGSFLSPPKI 

HVSARSKLffiPFFNKLFLMRXTVDDIPYLNLEGPDL 

QPKRDHVLHVTFPKEWKTSDLYQLFSAFGNIQIS 

WIDDTSAFVSLSQPEQVKIAVNTSKYAESYRIQT 

YAEYMGRKQEEKQIKRKWTEDSWKEADSKRLN 

PQCIPYTLQNHYYRNNSFTAPSTVGKRNLSPSQE 

EAGLEDGVSGEISDTELEQTDSCAEPLSEGRKKA 

KKLKRMKKELSPAGSISKNSPATLFEVPDTW 


3116 


A 


3 


1443 


TREAPMALAVAPWGRQWEEARALGRAVRMLQ 

RLEEQCVDPRLSVSPPSLRDLLPRTAQLLREVAH 

SRRAAGGGGPGGPGGSGDFLLIYLANLEAKSRQ 

VAALLPPRGRRSANDELFRAGSRLRRQLAKLAII 

FSHMHAELHALFPGGKYCGHMYQLTKAPAHTF 

WRESCGARCVLPWAEFESLLGTCHPVEPGCTAL 

ALRTTIDLTCSGHVSIFEFDVFTRLFQPWPTLLKN 

WQLLAVNHPGYMAFLTYDEVQERLQACRDKPG 

SYIFRPSCTRLGQWAIGYVSSDGSILQTIPANKPLS 

QVLLEGQKDGFYLYPDGKTHNPDLTELGQAEPQ 

QRIHVSEEQLQLYWAMDSTFELCKICAESNKDV 

KIEPCGHLLCSCCLAAWQHSDSQTCPFCRCEIKG 

WEAVSIYQFHGQATAEDSGNSSDQEGRELELGQ 

VPLSAPPLPPRPDLPPRKPRNAQPKVRLLKGNSPP 

AALGPQDPAPA 


3117 


A 


296 


3547 


ERHSSPLLQHILTHALMRNKKHShnSIWLAQHWF 

QSSIILCFSPVGRTLRVRARKFPAIVNCTAIDWFH 

AWPQEALVSVSRRFIEETKGIEPVHKDSISLFMAH 

VHTTVNEMSTRYYQNERRHNYTTPKSFLEQISLF 

K^^LLKKKQNEVSEKKERLVNGIQKLKTTASQVG 

DLKARLASQEAELQLRNHDAEALITKIGLQTEKV 

SREKTIADAEERKVTAIQTEVFQKQRECEADLLK 

AEPALVAATAALNTLNRVNLSELKAFPNPPIAVT 

NVTAAVMVLLAPRGRVPKDRSWKAAKVFMGK 

VDDFLQALINYDKEHIPENCLKVVNEHYLKDPEF 

NPNLIRTKSFAAAGLCAAWINIIKFYEVYCDVEP 

KRQALAQANLELAAATEKLEAIRKKLVVSANYD 

lEKSEKIRWGQSIKSFEAQEKTLCGDVLLTAAFVS 

YVGPFTRQYRQELVHCKWVPFLQQKVSIPLTEG 

LDLISMLTDDATIAAWNNEGLPSDRMSTENAAIL 

THCERWPLVIDPQQQGIKWIKNKYGMDLKVTHL 

GQKGFLNAIETALAFGDVILIENLEETIDPVLDPL 

LGRNTIKKGKYIRIGDKECEFNKNFRLILHTKLAN 

PHYKPELQAQTTLLNFTVTEDGLEAQLLAEVVSI 

ERPDLEKLKLVLTKHQNDFKIELKYLEDDLLLRL 

SAAEGSFLDDTKLVERLEATKTTVAEIEHKVIEA 

KENERKIhffiARECYRPVAARASLLYFVn^LQM 

NPLYQFSLKAFNVLFHRAIEQADKVEDMQGRISI 

LMESITHAVFLYTSQALFEKDKLTFLSQMAFQIL 

LRKKEIDPLELDFLLRFTVEHTHLSPVDFLTSQSW 

SAIKAIAVMEEFRGIDRDVEGSAKQWRKWVESE 

CPEKEKLPQEWKKKSLIQKLILLRAMRPDRMTY 

ALRNFVEEKLGAKYVERTRLDLVKAFEESSPATP 

IFFILSPGVDALKDLEILGKRLGFTIDSGKFHNVSL 

GQGQETVAEVALEKASKGGHWVILQNVHLVAK 

WLGTLEKLLERFSQGSHRDYRVFMSAESAPTPD 

EffllPQGLLENSKITNEPPTGMLANLHAALYNFD 

Q 


3118 


A 


1 


226 


PYSLSTSCLGSPTSPRLEMDPNCSCATGGSCTCTG 
SCKCKECKCNSCKKSECGAISRNLGLSQVRGRKP 
ELGMEE 


3119 


A 


1254 


4133 


PLATLTMEEQGHSEMEUPSESHPfflQLLKSNREL 
LVTHIRNTQCLVDNLLKNDYFSAEDAEIVCACPT 
QPDKVRKILDLVQSKGEEVSEFFLYLLQQLADAY 
VDLRPWLLEIGFSPSLLTQSKWVNTDPVSRYTQ 
QLRHHLGRDSKFVLCYAQKEELLLEEIYMDTIME 
LVGFSNESLGSLNSLACLLDHTTGILNEQGETIFIL 
GDAGVGKSMLLQRLQSLWATGRLDAGVKFFFH 
FRCRMFSCFKESDRLCLQDLLFKHYCYPERDPEE 
VFAFLLRFPHVALFTFDGLDELHSDLDLSRVPDS 
SCaPWEPAHPLVLLANLLSGKLLKGASKLLTART
GIEVPRQFLRKKVLLRGFSPSHLRAYARJRMFPER 
ALQDRLLSQLEANPNLCSLCSVPLFCWnFRCFQH 
FRAAFEGSPQLPDCTMTLTDVFLLVTEVHLNRM 
QPSSLVQRNTRSPVETLHAGRDTLCSLGQVAHR 
GMEKSLFVFTQEEVQASGLQERDMQLGFLRALP 
ELGPGGDQQSYEFFULTLQAFFTAFFLVLDDRVG 
TQELLRFFQEWMPPAGAATTSCYPPFLPFQCLQG 
SGPAREDLFKNKDHFQFTNLFLCGLLSKAKQKLL 
RHLVPAAALRRKRKALWAHLFSSLRGYLNSLPR 
VQVESFNQVQAMPTFIWMLRCIYETQSQKVGQL 
AARGICANYLKLTYCNACSADCSALSFVLHHFP 
KRLALDLDNNNLNDYGVRELQPCFSRLTVLRLS 
VNQITDGGVKVLSEELTKYKIVTYLGLYNNQITD 
VGARYVTKILDECKGLTHLKLGKNKITSEGGKY 
LALAVKNSKSISEVGMWGNQVGDEGAKAFAEA 
LRNHPSLTTLSLASNGISTEGGKSLARALQQNTSL 
EILWLTQNELNDEVAESLAEMLKVNQTLKHLWL 
IQNQITAKGTAQLADALQSNTGITEICLNGNLIKP 
EEAKVYEDEKRIICF 


3120 


A 


43 


1004 


QLWGFAAGSDSRPAMGCDGGTIPKRHELVKGPK 

KVEKVDKDAELVAQWNYCTLSQEILRRPrVACE 

LGRLYNKDAVIEFLLDKSAEKALGKAASHIKSIK 

NVTELKLSDNPAWEGDKGNTKGDKHDDLQRAR 

FICPVVGLEMNGRHRFCFLRCCGCVFSERALm 

KAEVCHTCGAAFQEDDVrVLNGTKEDVDVLKTR 

MEERRLRAKLEKKTKKPKAAESVSKPDVSEEAP 

GPSKVKTGKPEEASLDSREKKTNLAPKSTAMNE 

SSSGKAGKPPCGATKRSIADSEESEAYKSLFTTHS 

SAKRSKEESAHWVTHTSYCF 


3121 


A 


3 


1490 


HASGPTRPVSWSFHKLKTMKHLLLLLLCVFLVK 

SQGVNDNEEGFFSARGHRPLDKKREEAPSLRPAP 

PPISGGGYRARPAKAAATQKKVERKAPDAGGCL 

HADPDLGVLCPTGCQLQEALLQQERPIRNSVDEL 

NNNVEAVSQTSSSSFQYMYLLKDLWQKRQKQV 

KDNENVVNEYSSELEKHQLYIDETVNSNIPTNLR 

VLRSILENLRSKIQKLESDVSAQMEYCRTPCTVS 

CNIPVVSGKECEEIIRKGGETSEMYLIQPDSSVKP 

YRVYCDMNTENGGWTVIQNRQDGSVDFGRKW 

DPYKQGFGNVATNTDGKNYCGLPGEYWLGNDK 

ISQLTRMGPTELLIEMEDWKGDKVKAHYGGFTV 

QNEANKYQISVNKYRGTAGNALMDGASQLMGE 

NRTMTIHNGMFFSTYDRDNDGWLTSDPRKQCSK 
EDGGGWWYNRCHAANPNGRYYWGGQYTWDM 
AKHGTDDGVVWMNWKGSWYSMKKMSMKIRP 
FFPQQ 


3122 


A 


3 


1490 


HASGPTRPVSWSFHKLKTMKHLLLLLLCVFLVK 

SQGVNDNEEGFFSARGHRPLDKKREEAPSLRPAP 

PPISGGGYRARPAKAAATQKKVERKAPDAGGCL 

HADPDLGVLCPTGCQLQEALLQQERPIRNSVDEL 

NNNVEAVSQTSSSSFQYMYLLKDLWQKRQKQV 

KDNENVVNEYSSELEKHQLYIDETVNSNIPTNLR 

VLRSILENLRSKIQKLESDVSAQMEYCRTPCTVS 

CNIPVVSGKECEEIIRKGGETSEMYLIQPDSSVKP

YRVYCDMNTENGGWTVIQNRQDGSVDFGRKW 

DPYKQGFGNVATNTDGKNYCGLPGEYWLGNDK 

ISQLTRMGPTELLIEMEDWKGDKVKAHYGGFTV 

QNEANKYQISVNKYRGTAGNALMDGASQLMGE 

NRTMTIHNGMFFSTYDRDNDGWLTSDPRKQCSK 

EDGGGWWYNRCHAANPNGRYYWGGQYTWDM 

AKHGTDDGVVWMNWKGSWYSMKKMSMKIRP

FFPQQ 


3123 


A 


3 


1490 


HASGPTRPVSWSFHKLKTMKHLLLLLLCVFLVK 

SQGVNDNEEGFFSARGHRPLDKKREEAPSLRPAP 

PPISGGGYRARPAKAAATQKKVERKAPDAGGCL 

HADPDLGVLCPTGCQLQEALLQQERPIRNSVDEL 

NNNVEAVSQTSSSSFQYMYLLKDLWQKRQKQV 

KDNENVVNEYSSELEKHQLYIDETVNSNIPTNLR 

VLRSILENLRSKIQKLESDVSAQMEYCRTPCTVS 

CNIPVVSGKECEEIIRKGGETSEMYLIQPDSSVKP 

YRVYCDMNTENGGWTVIQNRQDGSVDFGRKW 

DPYKQGFGNVATNTDGKNYCGLPGEYWLGNDK 

ISQLTRMGPTELLIEMEDWKGDKVKAHYGGFTV 

QNEANKYQISVNKYRGTAGNALMDGASQLMGE 

NRTMTIHNGMFFSTYDRDNDGWLTSDPRKQCSK

EDGGGWWYNRCHAANPNGRYYWGGQYTWDM 

AKHGTDDGVVWMNWKGSWYSMKKMSMKIRP 

FFPQQ 


3124 


A 


3 


544 


RVDDFVLLRSRLALRWLSHVRRPSRRVPRMPRG 

SRSRTSRMAPPASRAPQMRAAPRPAPVAQPPAA 

APPSAVGSSAAAPRQPGLMAQMATTAAGVAVG 

SAVGHTLGHAITGGFSGGSNAEPARPDITYQEPQ 

GTQPAQQQQPCLYEIKQFLECAQNQGDIKLCEGF 

NEVLKQCRLANGLA 


3125 


A 


3 


571 


GNSYNHRSLAAYPYMSHSQHSPYLQSYHNSSAA 

AQTRGDDTDQQKTTVIENGEIRFNGKGKKIRKPR 

TIYSSLQLQALNHRFQQTQYLALPERAELAASLG 

LTQTQVKIWFQNKRSKFKKLLKQGSNPHESDPL 

QGSAALSPRSPALPPVWDVSASAKGVSMPPNSY 

MPGYSHWYSSPHQDTMQRPQMM 


3126 


A 


43 


5377 


LSVFFPIPVDGRDRGSNPSLESTSSELSTSTSEGSL 

SAMSGRNELHSRLHPHPQSSLIPMMFSPPESLLAS 

CILRGNFAEAHQVLFTFNLKSSPSSGELMFMERY 

QEVIQELAQVEHKIENQNSDAGSSTIRRTGSGRST 

LQAIGSAAAAGMVFYSISDVTDKLLNTSGDPIPM 

LQEDFWISTALVEPTAPLREVLEDLSPPAMAAFD 

LACSQCQLWKTCKQLLETAERRLNSSLERRGRRl 

DHVLLNADGIRGFPWLQQISKSLNYLLMSASQT 

KSESVEEKGGGPPRCSITELLQMCWPSLSEDCVA 

SHTTLSQQLDQVLQSLREALELPEPRTPPLSSLVE 

QAAQKAPEAEAHPVQIQTQLLQKNLGKQTPSGS 

RQMDYLGTFFSYCSTLAAVLLQSLSSEPDHVEVK 

VGNPFVLLQQSSSQLVSHLLFERQVPPERLAALL 

AQENLSLSVPQVIVSCCCEPLALCSSRQSQQTSSL 

LTRLGTLAQLHASHCLDDLPLSTPSSPRTTENPTL 

ERKPYSSPRDSSLPALTSSALAFLKSRSKLLATVA 

CLGASPRLKVSKPSLSWKELRGRREVPLAAEQV 

ARECERLLEQFPLFEAFLLAAWEPLRGSLQQGQS 

LAVNLCGWASLSTVLLGLHSPIALDVLSEAFEES 

LVARDWSRALQLTEVYGRDVDDLSSIKDAVLSC 

AVACDKEGWQYLFPVKDASLRSRLALQFVDRW 

PLESCLEILAYCISDTAVQEGLKCELQRKLAELQ 

VYQKILGLQSPPVWCDWQTLRSCCVEDPSTVMN 

MILEAQEYELCEEWGCLYPIPREHLISLHQKHLL 

HLLERRDHDKALQLLRRIPDPTMCLEVTEQSLDQ 

HTSLATSHFLANYLTTHFYGQLTAVRHREIQALY 

VGSKILLTLPEQHRASYSHLSSNPLFMLEQLLMN 

MKVDWATVAVQTLQQLLVGQEIGFTMDEVDSL 

LSRYAEKALDFPYPQREKRSDSVIHLQEIVHQAA 

DPETLPRSPSAEFSPAAPPGISSIHSPSLRERSFPPT 

QPSQEFVPPATPPARHQWVPDETESICMVCCREH 

FTMFNRRHHCRRCGRLVCSSCSTKKMVVEGCRE 

NPARVCDQCYSYCNKDVPEEPSEKPEALDSSKSE 

SPPYSFVVRVPKADEVEWILDLKEEENELVRSEF 

YYEQAPSASLCIAILNLHRDSIACGHQLIEHCCRL 

SKGLTWEVDAGLLTblMKQLLFSAKMMFVKAG 

QSQDLALCDSYISKVDVLNILVAAAYRHVPSLDQ 

ILQPAAVTRLRNQLLEAEYYQLGVEVSTKTGLDT 

TGAWHAWGMACLKAGNLTAAREKFSRCLKPPF 

DLNQLNHGSRLVQDVVEYLESTVRPFVSLQDDD 

YFATLRELEATLRTQSLSLAVIPEGKIMNNTYYQ 

ECLFYLHNYSTNLAIISFYVRHSCLREALLHLLNK 

ESPPEVFDEGIFQPSYKSGKLHTLENLLESIDPTLES 

WGKYLIAACQHLQKKNYYHILYELQQFMKDQV 

RAAMTCIRFFSHKAKSYTELGEKLSWLLKAKDH 

LKIYLQETSRSSGRKKTTFFRKKMTAADVSRHM 

NTLQLQMEVTRFLHRCESAGTSQITTLPLPTLFG 

NNHMKMDVACKVMLGGKNVEDGFGIAFRVLQ 

DFQLDAAMTYCRAARQLVEKEKYSEIQQLLKCV 

SESGMAAKSDGDTILLNCLEAFKRIPPQCCFCSA 

QELEGLIQAIHNDDNKVRAYLICCKLRSAYLIAV 

KQEHSRATALVQQVQQAAKSSGDAVVQDICAQ 

WLLTSHPRGAHGPGSRK 


3127 


A 


467 


1259 


HLGPPLAWIPAASLTSTKGEFGVEDDRPARGPPP 

PKSEEASWSESGVSSSSGDGPFAGGEVDKRLHQL 

KTQLATLTSSLATVTQEKSRMEASYLADKKKMK 

QDLEDASNKAEEERARLEGELKGLQEQIAETKA 

RLITQQHDRAQEQSDHALMLRELQKLLQEERTQ 

RQDLELRLEETREALAGRAYAAEQMEGFELQTK 

QLTREVEELKSELQAIRDEKNQPDPRLQELQEEA 

ARLKSHFQAQLQQEMRKVIIHISFKHQPLT 


3128 


A 


1854 


798 


ASGSPAPSSSSAMAAACGPGAAGYCLLLGLHLFL 

LTAGPALGWNDPDRMLLRDVKALTLHYDRYTT 

SRRLDPIPQLKCVGGTAGCDSYTPKVIQCQNKG 

WDGYDVQWECKTDLDIAYKFGKTWSCEGYES 

SEDQYVLRGSCGLEYNLDYTELGLQKLKESGKQ 

HGFASFSDYYYKWSSADSCNMSGLITIWLLGIA 

FVVYKLFLSDGQYSPPPYSEYPPFSHRYQRFTNS 

AGPPPPGFKSEFTGPQNTGHGATSGFGSAFTGQQ 

GYENSGPGFWTGLGTGGILGYLFGSNRAATPFSD 

SWYYPSYPPSYPGTWNRAYSPLHGGSGSYSVCS 

NSDTKTRTASGYGGTRRR 


3129 


A 


2340 


1192 


ELARRPKQQSSEKSRIMMIRNWLTIFILFPLKLVEK 

CESSVSLTVPPVVKLENGSSTNVSLTLRPPLNATL 

VITFEITFRSKNITILELPDEVWPPGVTNSSFQVT 

SQNVGQLTVYLHGNHSNQTGPRIRFLVIRSSAISn 

NQVIGWIYFVAWSISFYPQVIMNWRRKSVIGLSF 

DFVALNLTGFVAYSVFNIGLLWVPYKEQFLLKY 

PNGVNPVNSNDVFFSLHAWLTLinVQCCLYERG 

GQRVSWPAIGFLVLAWLFAFVTMIVAAVGVITW 

LQFLFCFSYIKLAVTLVKYFPQAYMNFYYKSTEG 

WSIGNVLLDFTGGSFSLLQMFLQSYNNDQWTLIF 

GDPTKFGLGVFSIVFDWFFIQHFCLYRKRPGYD 

QLN 


3130 


A 


31 


2026 


CWWPPLLPQLEPEPPPLRPRVAASQGGGMLGKG 

WGGGGGIXAPKPSFVSYVRPEEIHTNEKEVTEK 

EVTLHLLPGEQLLCEASTVLKYVQEDSCQHGVY 

GRLVCTDFKIAFLGDDESALDNDETQFKNKVIGE 

NDITLHCVDQIYGVFDEKKKTLFGQLKKYPEKLII 

HCKDLRVFQFCLRYTKEEEVKRIVSGIIHHTQAP 

KliKRLFU^SYATAAQlWrVTDPKNHTVMFDTL 

KDWCWELERTKGNMKYKAVSVNEGYKVCERL 

PAYFVVPTPLPEENVQRFQGHGIPIWCWSCHNGS 

ALLKMSALPKEQDDGILQIQKSFLDGIYKTIHRPP 

YEIVKTEDLSSNFLSLQEIQTAYSKFKQLFLBDNST 

EFWDTDKWFSLLESSSWLDnRRCLKKAIEITEC 

MEAQNMNVLLLEENASDLCCLISSLVQLMMDPH 

CRTRIGFQSLIQKEWVMGGHCFLDRCNHLRQND 

KEEHQRQLSLPLTQSKSSPKRGFFREETDHLIKNL 

LGKRISKLINSSDELQDNFREFYDSWHSKSTDYH 

GLLLPHIEGPEIKVWAQRYLRWIPEAQILGGGQV 

ATLSKLLEMMEEVQSLQEKIDERHHSQQAPQAE 

APCLLRNSARLSSLFPFALLQRHSSKPVLPTSGW 

KALGDEDDLAKREDEFVDLGDV 


3131 


A 


126 


965 


QSRSRPRREGVGTGSRAVLCILATCGSKMSDIGD 

WFRSIPAITRYWFAATVAVPLVGKLGLISPAYLF 

LWPEAFLYRFQIWRPITATFYFPVGPGTGFLYLV 

NLYFLYQYSTRLETGAFDGRPADYLFMLLFNM 

CIVITGLAMDMQLLMIPLIMSVLYVWAQLNRDM 

IVSFWFGTRFKACYLPWVILGFNYIIGGSVINELIG 

NLVGHLYFFLMFRYPMDLGGRNFLSTPQFLYRW 

LPSRRGGVSGFGVPPASMRRAADQNGGGGRHN 

WGQGFRLGDQ 


3132 


A 


2 


350 


FVAGWRALTAPSTSARLRAFGWQAAARLLVFG 
ARGVGLGSGAPGSLPCYLRMDALALLGGLVNV 
ARLPERWGPGRFDYWGNSHQIMHLLSVGSILQL 
HAGWPDLLWAAHHACPRD 


3133 


A 


1 


2921 


MTCFKGQKGEQRSHAFEANKDHKAKVPSPNLYS 

QLNALQFTVDERSILWLNQFLLDLKQSLNQFMA 

VYKLNDNSKSDEHVDVRVDGLMLKFVIPSEVKS 

ECHQDQPRAISIQSSEMIATNTRHCPNCRHSDLEA 

LFQDFKDCDFFSKTYTSFPKSCDNFNLLHPIFQRH 

AHEQDTKMHEIYKGMTPQLNKNTLKTSAATDV 

WAVYFSQFWIDYEGMKSGKGRPISFVDSFPLSIW 

ICQPTRYAESQKEPQTCNQVSLNTSQSESSDLAG 

RLKRKKLLKEYYSTESEPLTNGGQKPSSSDTFFR 

FSPSSSEADIHLLVHVHKHVSMQINHYQYLLLLF 

LHESLILLSENLRKDVEAVTGSPASQTSICIGILLR 

SAELALLLIffVDQANTLKSPVSESVSPVVPDYLP 

TENGDFLSSKJRKQISRDINRIRSVTVNHMSDNRS 

MSVDLSHIPLKDPLLFKSASDTNLQKGISFMDYL 

SDKHLGKISEDESSGLVYKSGSGEIGSETSDKKDS 

FYTDSSSVLNYREDSNILSFDSDGNQNILSSTLTS 

KGNETIESIFKAEDLLPEAASLSENLDISKEETPPV 

RTLKSQSSLSGKPKERCPPNLAPLCVSYKNMKRS 

SSQMSLDTISLDSMILEEQLLESDGSDSHMFLEKG 

NKKNSTTNYRGTAESVNAGANLQNYGETSPDAI 

STNSEGAQENHDDLMSWVFKITGVNGEIDIRGE 

DTEICLQVNQVTPDQLGNISLRHYLCNRPVGSDQ 

KAVIHSKSSPEISLRFESGPGAVIHSLLAEKNGFL 

QCfflENFSTEFLtSSLMNIQHFLEDETVATVMPM 

KIQVSNTKINLKDDSPRSSTVSLEPAPVTVHIDHL 

VVERSDDGSFHIRDSHMLNTGNDLKENVKSDSV 

LLTSGKYDLKKQRSVTQATQTSPGVPWPSQSAN 

FPEFSFDFTREQLMEENESLKQELAKAKMALAE 

AHLEKDALLHHIKKMTVE 


3134 


A 


9 


1579 


EEEGLSGGGPRVPCSLWGKQTMDYDFKAKLAA 

ERERVEDLFEYEQCKVGRGTYGHVYKARRKDG 

KDEKEYALKQIEGTGISMSACREIALLRELKHPN 

VIALQKVFLSHSDPIKVWLLFDYAEHDLWHIIKFH 

RASKANKKPMQLPRSMVKSLLYQILDGIHYLHA 

NWVLHRDLKPANILVMGEGPERGRVKIADMGF

ARLFNSPLKPLADLDPVWTFWYRAPELLLGAR 

HYTKAIDIWAIGCIFAELLTSEPIFHCRQEDIKTSN 

PFHHDQLDRIFSVMGFPADKDWEDIRKMPEYPT 

LQKDFRRTTYANSSLIKYMEKHKVKPDSKVFLL 

LQKLLTMDPTKRITSEQALQDPYFQEDPLPTLDV 

FAGCQIPYPKREFLNEDDPEEKGDKNQQQQQNQ 

HQQPTAPPQQAAAPPQAPPPQQNSTQTNGTAGG 

AGAGVGGTGAGLQHSQDSSLNQVPPNKKPRLGP 

SGANSGGPVMPSDYQHSSSRLNYQSSVQGSSQS 

QSTLGYSSSSQQSSQYHPSHQAHRY 


3135 


A 


3 


1111 


ERKMAEPPSPVHCVAAAAPTATVSEKEPFGKLQ 

LSSRDPPGSLSAKKVRTEEKKAPRRVNGEGGSG 

GNSRQLQPPAAPSPQSYGSPASWSFAPLSAAPSPS 

SSRSSFSFSAGTAVPSSASASLSQPGPRKLLVPPTL 

LHAQPHHLLLPAAAAAASANAKSRRPKEKREKE 

RRRHGLGGAREAGGASREENGEVKPLPRDKIKD 

KIKERDKEKEREKKKHKVMNEIKKENGEVKILL 

KSGKEKPKTl^DLQIKKVKKKKKKK^ 

KRPKMYSKSIQTICSGLLTDVEDQAAKGILNDNI 

KDYVGKNLDTKNYDSKIPENSEFPFVSLKEPRVQ 

NNLKRLDTLEFKQLIHIEHQPNGGASVIHCLQ 


3136 


A 


1442 


682 


TAAMSIFTPTNQIRLTNVAVVRMKRAGKRFE 

YKNKWGWRSGVEKDLDEVLQTHSVFVNVSKG 

QVAKKEDLISAFGTDDQTEICKQILTKGEVQVSD 

KERHTQLEQMFRDIATIVADKCVl^ETKRPYTVI 

LIERAMKDIHYSVKTNKSTKQQALEVnCQLKEK 

MKIERAHMIU.RFILPVNEGKKLKEKLKPLIKVffiS 

EDYGQQLEIVCLIDPGCFREIDELIKKETKGKGSL 

EVLNLKDVEEGDEKFE 


3137 


A 


1 


3143 


MVEGKRHVLHGGRQERMRAKQKGKPLIKSSDL 

VRLIHYHHNSSPLHKQSSGPSSSPAAAAAPEKPG 

PKAAEVGDDFLGDFVVGERVWVNGVKPGWQY 

LGETQFAPGQWAGVVLDDPVGKNDGAVGGVR 

YFECPALQGIFTRPSKLTRQPTAEGSGSDAHSVES 

LTAQNLSLHSGTATPPLTSRVffLRESVLNSSVKT 

GNESGSNLSDSGSVKRGEKDLRLGDRVLVGGTK 

TGVVRYVGETDFAKGEWCGVELDEPLGKNDGA 

VAGTRYFQCPPKFGLFAPIHKVmiGFPSTSPAKA 

KKTKRMAMGVSALTHSPSSSSISSVSSVASSVGG 

RPSRSGLLTETSSRYARKISGTTALQEALKEKQQ 

HIEQLLAERDLERAEVAKATSHICEVEKEIALLK 

AQHEQYVAEAEEKLQRARLLVESVRKEKVDLSN 

QLEEERRKVEDLQFRVEEESITKGDLETQTQLEH 

ARJGELEQSLLLEKAQAERLLRELADNRLTTVAE 

KSRVLQLEEELTLRRGEIEELQQCLLHSGPPPPDH 

PDAAEILRLRERLLSASKEHQRESGVLRDKYEKA 

LKAYQAEVDKLRAANEKYAQEVAGLKDKVQQ 

ATSElSnVIGLMDNWKSKLDSLASDHQKSLEDLKA 

TLNSGPGAQQKEIGELKAVMEGIKMEHQLELGN 

LQAKHDLETAMHVKEKEALREKLQEAQEELAG 

LQRHWRAQLEVQASQHRLELQEAQDQRRDAEL 

RVHELEKLDVEYRGQAQAIEFLKEQISLAEKKML 

DYERLQRAEAQGKQEVESLREKLLVAENRLQAV 

EALCSSQHTHMIESNDISEETIRTKETVEGLQDKL 

NKRDKEVTALTSQTEMLRAQVSALESKCKSGEK 

KVDALLKEKRRLEAELETVSRKTHDASGQLVLIS 

QELLRKERSLNELRVLLLEANRHSPGPERDLSRE 

VHKAEWRIKEQKLKDDIRGLREKLTGLDKEKSL 

SDQRRYSLIDPSSAPELLRLQHQLMSTEDALRDA 

LDQAQQVEKLMEAMRSCPDKAQTIGNSGSANGI 

HQQDKAQKQEDKH 


3138 


A 


110 


2499 


QDRRLLRLELQKTCQPTSTMSGSHTPACGPFSAL 

TPSIWPQEILAKYTQKEESAEQPEFYYDEFGFRV 

YKEEGDEPGSSLLANSPLMEDAPQRLRWQAHLE 

FTHNHDVGDLTWDKIAVSLPRSEKLRSLVLAGIP 

HGMRPQLWMRLSGALQKKRNSELSYREIVKNSS 

NDETIAAKQIEKDLLRTMPSNACFASMGSIGVPR 

LRRVLRALAWLYPEIGYCQGTGMVAACLLLFLE 

EEDAFWMMSAIIEDLLPASYFSTTLLGVQTDQRV 

LRHLIVQYLPRLDKLLQEHDIELSLITLHWFLTAF 

ASWDIKLLLRIWDLFFYEGSRVLFQLTLGMLHL 

KEEELIQSENSASIFNTLSDIPSQMEDAELLLGVA 

MRLAGSLTDVAVETQRRKHLAYLIADQGQLLGA 

GTLTNLSQVVRRRTQRRKSTITALLFGEDDLEAL 

KAKNIKQTELVADLREADLRVARHFQCTDPKNCS 

VVSRQLPGLLPNTALTPPTPLVGLCSLWQELTPD 

YSMESHQRDHENYVACSRSHRRRAKALLDFERH 

DDDELGFRKNDIITIVSQKDEHCWVGELNGLRG

WFPAKFVEVLDERSKEYSIAGDDSVTEGVTDLV 

RGTLCPALKALFEHGLKKPSLLGGACHPWLFIEE 

AAGREVERDFASVYSRLVLCKTFRLDEDGKVLT 

PEELLYRAVQSVNVTHDAVHAQMDVKLRSLICV 

GLNEQVLHLWLEVLCSSLPTVEKWYQPWSFLRS 

PGWVQIKCELRVLCCFAFSLSQDWELPAKREAQ 

QPLKEGVRDMLVKHHLFSWDVDG 


3139 


A 


110 


2499 


QDRRLLRLELQKTCQPTSTMSGSHTPACGPFSAL 

TPSIWPQEILAKYTQKEESAEQPEFYYDEFGFRV

YKEEGDEPGSSLLANSPLMEDAPQRLRWQAHLE 

FTHNHDVGDLTWDKIAVSLPRSEKLRSLVLAGIP 

HGMRPQLWMRLSGALQKKRNSELSYREIVKNSS

NDETIAAKQIEKDLLRTMPSNACFASMGSIGVPR 

LRRVLRALAWLYPEIGYCQGTGMVAACLLLFLE 

EEDAFWMMSAIIEDLLPASYFSTTLLGVQTDQRV 

LRHLIVQYLPRLDKLLQEHDIELSLITLHWFLTAF 

ASWDIKLLLRIWDLFFYEGSRVLFQLTLGMLHL 

KEEELIQSENSASIFNTLSDIPSQMEDAELLLGVA 

MRLAGSLTDVAVETQRRKHLAYLIADQGQLLGA 

GTLTNLSQVVRRRTQRRKSTITALLFGEDDLEAL 

KAKNIKQTELVADLREAILRVARPIFQCTDPKNCS 

VVSRQLPGLLPNTALTPPTPLVGLCSLWQELTPD 

YSMESHQRDHENYVACSRSHRRRAKALLDFERH 

DDDELGFRKNDIITIVSQKDEHCWVGELNGLRG 

WFPAKFVEVLDERSKEYSIAGDDSVTEGVTDLV 

RGTLCPALKALFEHGLKKPSLLGGACHPWLFIEE 

AAGREVERDFASVYSRLVLCKTFRLDEDGKVLT 

PEELLYRAVQSVNVTHDAVHAQMDVKLRSLICV 

GLNEQVLHLWLEVLCSSLPTVEKWYQPWSFLRS 

PGWVQIKCELRVLCCFAFSLSQDWELPAKREAQ 

QPLKEGVRDMLVKHHLFSWDVDG 


3140 


A 


1 


4939 


SAALGASLAPRPGLPGVHGRGPGTLSGRAMEG 

AEPRARPERLAEAETRAADGGRLVEVQLSGGAP 

WGFTLKGGREHGEPLVITKIEEGSKAAAVDKLL 

AGDEIVGINDIGLSGFRQEAICLVKGSHKTLKLV 

VKRRSELGWRPHSWHATKFSDSHPELAASPFTST 

SGCPSWSGRHHASSSSHDLSSSWEQTNLQRTLD 

HFSSLGSVDSLDHPSSRLSVAKSNSSIDHLGSHSK 

RDSAYGSFSTSSSTPDHTLSKADTSSAENILYTVG 

LWEAPRQGGRQAQAAGDPQGSEEKLSCFPPRVP 

GDSGKGPRPEYNAEPKLAAPGRSNFGPVWYVPD 

KKKAPSSPPPPPPPLRSDSFAATKSHEKAQGPVFS 

EAAAAQHFTALAQAQPRGDRRPELTDRPWRSAH 

PGSLGKGSGGPGCPQEAHADGSWPPSKDGASSR 

LQASLSSSDVRPPQSPHSGRHPPLYSDHSPLCADS 

LGQEPGAASFQNDSPPQVRGLSSCDQKLGSGWQ 

GPRPCVQGDLQAAQLWAGCWPSDTALGALESL 

PPPTVGQSPRHHLPQPEGPPDARETGRCYPLDKG 

AEGCSAGAQEPPRASRAEKASQRLAASITWADG 

ESSRICPQETPLLHSLTQEGKRRPESSPEDSATRPP 

PFDAIWGKPTRRSDRFATTLRI^IQMHRAKLQK 

SRSTVALTAAGEAEDGTGRWRAGLGGGTQEGPL 

AGTYKDHLKEAQARVLRATSFKRRDLDPNPGDL 

YPESLEHRMGDPDTVPHFWEAGLAQPPSSTSGGP 

HPPRIGGRRIIFTAEQKLKSYSEPEKMNEVGLTRG 

YSPHQHPRTSEDTVGTFADRWKFFEEtSKPVPQR 

PAQKQALHGIPRDKPERPRTAGRTCEGTEPWSRT 

TSLGDSLNAHSAAEKAGTSDLPRRLGTFAEYQAS 

WKEQRKPLEARSSGRCHSADDILDVSLDPQERPQ 

HVHGRSRSSPSTDHYKQEASVELRRQAGDPGEP 

REELPSAVRAEEGQSTPRQADAQCREGSPGSQQ 

HPPSQKAPNPPTFSELSHCRGAPELPREGRGRAG 

TLPRDYRYSEESTPADLGPRAQSPGSPLHARGQD 

SWPVSSALLSKRPAPQRPPPPKREPRRYRATDGA 

PADAPVGVLGRPFPTPSPASLDVYVARLSLSHSPS 

VFSSAQPQDTPKATVCERGSQHVSGDASRPLPEA 

LLPPKQQHLRLQTATMETSRSPSPQFAPQKLTDK 

PPLLIQDEDSTRmRVMDNNTTVKMVPIKIVHSES 

QPEKESRQSLACPAEPPALPHGLEKDQIKTLSTSE 

QFYSRFCLYTRQGAEPEAPHRAQPAEPQPLGTQV 

PPEKDRCTSPPGLSYMKAKEKTVEDLKSEELARE 

IVGKDKSLADILDPSVKIKTTMDLMEGIFPKDEH 

LLEEAQQRRKLLPKIPSPRSTEERKEEPSVPAAVS 

LATNSTYYSTSAPKAELLIKMKDLQEQQEHEEDS 

GSDLDHDLSVKKQELIESISRKLQVLREARESLLE 

DVQANTVLGAEVEAIVKGVCKPSEFDKFRMFIG 

DLDKVVNLLLSLSGRLARVENALNNLDDGASPG 

DRQSLLEKQRVLIQQHEDAKELKENLDRRERIVF 

DILANYLSEESLADYEHFVKMKSALIIEQRELED 

KIHLGEEQLKCLLDSLQPERGK 


3141 


A 


97 


1894 


SPRGATMETPPLPPACTKQGHQKPLDSKDDNTE 

KHCPVTVNPWHMKKAFKVMNELRSQNLLCDVT 

IVAEDMEISAHRVVLAACSPYFHAMFTGEMSESR 

AKRVRIKEVDGWTLRMLIDYVYTAEIQVTEENV 

QVLLPAAGLLQLQDVKKTCCEFLESQLHPVNCL 

GIRAFADMHACTiDLLNKANTYAEQHFADVVLSE 

EFLNLGIEQVCSLISSDKLTISSEEKVFEAVIAWV 

NHDKDVRQEFMARLMEHVRLPLLPREYLVQRV 

EEEALVKNSSACK^^^IEAMKYHLLPTEQRILMK 

SVRTRLRTPMNLPKLMVVVGGQAPKAIRSAECY 

DFKEQRWHQVAELPSRRCRAGMVYLAGLVFAV 

GGFNGSLRVRTVDSYDPVKDQWTSVANMRDRR 

STLGAAVLNGLLYAVGGFDGSTGLSSVEAYNIKS 

NEWFHVAPMNTRRSSVGVGWGGLLYAVGGYD 

GASRQYLSTVECYNATTNEWTYIAEMSTRRSGA 

GVGVLNNLLYAVGGHDGPLVRKSVEVYDPTTN 

AWRQVADMNMCRRNAGVCAVNGLLYVVGGD 

DGSCNLASVEYYNPTTDKWTWSSCMSTGRSYA 

GVTVIDKPL 


3142 


A 


1211 


1311 


FSM.TTEKVAHAKEEhn.SMHQMLDQTLLELNN 
M 


3143 


A 


1809 


1041 


SEELDREKKLKEDSPRKTPNKESGVPSLPVSLTSI 

KEEPKEAKHPDSQSMEESKLKNDDRKTPVNWK 

DSRGTRVAVSSPMSQHQSYIQYLHAYPYPQMYD 

PSHPAYRAVSPVLMHSYPGAYLSPGFHYPVYGK 

MSGREETEKVNTSPSVNTKTTTESKALDLLQQH 

ANQYRSKSPAPVEKATAEREREAERERDRHSPFG 

QRHLHTHHHTHVGMGYPLIPGQYDPFQGLTSAA 
LVASQQVAAQASASGMFPGQRR 


3144 


A 


78 


604 


SVSGIVLDLLPYLHFLSNMNLDGSAQDPEKREYS 

SVCVGREDDIKKSERMTAVVHDREVVIFYHKGE 

YHAMDIRCYHSGGPLHLGDIEDFDGRPCIVCPW 

HKYKITLATGEGLYQSINPKDPSAKPKWCSKGIK 

QRJHTVTVDNGNIYVTLSNEPFKCDSDFYATGDF 

KVIKSSS 


3145 


A 


2 


333 


RNSLLLPPLHLDNSTPAKMSCQQNQQQCQPPPK 
CPSPKCPPKSPVQCLPPASSGCAPSSGGCGPSSEG 
GCFLNHHRRHHRCRRQRPNSCDRGSGQQGGGS 
GCGHGSGGCC 


3146 


A 


3


1151 


VCTALQEFGTRSTLLRCLDSGFRPGASRGLVGSW 

AAMESTLGAGIVIAEALQNQLAWLENVWLWITF 

LGDPKILFLFYFPAAYYASRRVGIAVLWISLITEW 

LNLIFKWFLFGDRPFWWVHESGYYSQAPAQVHQ 

FPSSCETGPGSPSGHCMITGAALWPIMTALSSQV 

ATRARSRWVRVMPSLAYCTFLLAVGLSRIFILAH 

FPHQVLAGLITGAVLGWLMTPRVPMERELSFYG 

LTALALMLGTSLIYWTLFTLGLDLSWSISLAFKW 

CERPEWIHVDSRPFASLSRDSGAALGLGIALHSPC 

YAQVRRAQLGNGQKIACLVLAMGLLGPLDWLG 

HPPQISLFYIFNFLKYTLWPCLVLALVPWAVHMF 

SAQEAPPIHSS 


3147 


A 


1437 


594 


RSFSLSFSLLSPSEMMALGAAGATRVFVAMVAA 

ALGGHPLLGVSATLNSVLNSNAIKNLPPPLGGAA 

GHPGSAVSAAPGILYPGGNKYQTroNYQPYPCAE 

DEECGTDEYCASPTRGGDAGVQICLACRKRRKR 

CMRHAMCCPGNYCKNGICVSSDQNHFRGEIEETI 

TESFGNDHSTLDGYSRRTTLSSKMYHTKGQEGS 

VCLRSSDCASGLCCARHFWSKICKPVLKEGQVC 

TKHRRKGSHGLEIFQRCYCGEGLSCRIQKDHHQ 

ASNSSRLHTCQRH 


3148 


A 


1 


1562 


MSTLYDIRAHKAQLLRFFASSDSNKALEQRRTLH 

TPKLEHLDRVLYEWFLGKRSEGVPVSGPMLIEK 

AKDFYEQMQLTEPCVFSGGWLWRFKARHGIKK 

LDASSEKQSADHQAAEQFCAFFRSLAAEHGLSA 

EQVYNADETGLFWRCLPNPTPEGGAVPGPKQGK 

DRLTVLMCANATGSHRLKPLAIGKCSGPRAFKGI 

QHLPVAYKAQGNAWVDKEBFSDWFHHIFVPSVR 

EHFRTIGLPEDSKAVLLLDSSRAHPQEAELVSSN 

VFTIFLPASVASLVQPMEQGIRRDFMRNFINPPVP 

LQGPHARYNMNDAIFSVACAWNAVPSHVFRRA 

WRKLWPSVAFAEGSSSEEELEAECFPVKPHNKSF 

AHILELVKEGSSCPGQLRQRQAASWGVAGREAE 

GGRPPAATSPAEVVWSSEKTPKADQDGRGDPGE 

GEEVAWEQAAVAFDAVLRFAERQPCFSAQEVG 

QLRALRAVFRSQQQVRRRRGALGAWKVEALQ 

EGPGGCGATAQSPLPCSSTAGDN 


3149 


A 


132 


4125 


VAVMISTAPLYSGVHNWTSSDRIRMCGINEERRA 

PLSDEESTTGDCQHFGSQEFCVSSSFSKVELTAV 

GSGSNARGADPDGSATEKLGHKSEDKPDDPQPK 

MDYAGNVAEAEGLLVPLSSPGDGLKLPASDSAE 

ASNSRADCSWTPLNTQMSKQVDCSPAGVKALDS 

RQGVGEK3NITFILATLGTGWVEGTLPLVTTNFSP 

LPAPICPPAPSSASVPHSVPDAFQAPVPPSAPTLVL 

APVPTPVLAPMPASTPPAAPAPPSVPMPTPTPSSG 

PPSTPTLIPAFAPTPVPAPTPAPIFTPAPTPMPAATP 

AAIPTSAPIPASFSLSRVCFPAAQAPAMQKVPLSF 

QPGTVLTPSQPLVYIPPPSCGQPLSVATLPTTLGV 

SSTLTLPVLPSYLQDRCLPGVLASPELRSYPYAFS 

VARPLTSDSKLVSLEVNRLPCTSPSGSTTTQPAPD 

GVPGPLADTSLVTASAKVLPTPQPLLPAPSGSSAP 

PHPAKMPSGTEQQTEGTSVTFSPLKSPPQLEREM 

ASPPECSEMPLDLSSKSNRQKLPLPNQRKTPPMP 

VLTPVHTSSKALLSTVLSRSQRTTQAAGGNVTSC 

LGSTSSPFVIFPEIVRNGDPSTWVKNSTALISTIPG 

TYVGVANPVPASLLLNKDPNLGLNRDPRHLPKQ 

EPISIIDQGEPKGTGATCGKKGSQAGAEGQPSTV 

KRYTPARIAPGLPGCQTKELSLWKPTGPANIYPR 

CSVNGKPTSTQVLPVGWSPYHQASLLSIGISSAG 

QLTPSQGAPIRPTSVVSEFSGVPSLSSSEAVHGLP 

EGQPRPGGSFVPEQDPVTKNKTCRIAAKPYEEQV 

NPVLLTLSPQTGTLALSVQPSGGDIRMNQGPEES 

ESHLCSDSTPKMEGPQGACGLKLAGDTKPKNQV 

LATYMSHELVLATPQNLPKMPELPLLPHDSHPKE 

LILDVVPSSRRGSSTERPQLGSQVDLGRVKMEKV 

DGDVVFNLATCFRADGLPVAPQRGQAEVRAKA 

GQARVKQESVGVFACKNKWQPDDVTESLPPKK 

MKCGKEKDSEEQQLQPQAKAWRSSHRPKCRK 

LPSDPQESTKKSPRGASDSGKEHNGVRGKHKHR 

KPTKPESQSPGKRADSHEEGSLEKKAKSSFRDFIP 

VVLSTRTRSQSDLKARKQKTSSSQSLEHRLRNRN 

LLLPNKVQGISDSPNGFLPNNLEEPACLENSEKPS 

GKRKCKTKHMATVSEEAKGKGRWSQQKTRSPK 

SPTPVKPTEPCTPSKSRSASSEEASESPTARQIPPE 

ARRLIVNKNAGETLLQRAARLGYKDWLYCLQK 

DSEDVNHRDNAGYTALHEACSRGWTDILNILLE 

HGA 


3150 


A 


3 


2795 


SLRMHNLSILVRQIKFYYQETLQQLMMSLPNVLI 

IGKNPFSEQGTEEVKKLLLLLLGCAVQCQBGKEEF 

lERIQGLDFDTKAAVAAfflQEVTHNQENVFDLQ 

WMEVTDMSQEDIEPLLKNMALHLKRLIDERDEH 

SETIIELSEERDGLHFLPHASSSAQSPCGSPGMKR 

TESRQHLSVELADAKAKIRRLRQELEEKTEQLLD 

CKQELEQMEIELKRLQQENMNLLSDARSARMYR 

DELDALREKAVRVDKLESEVSRYKERLHDIEFY 

KARVEELKEDNQVLLETKTMLEDQLEGTRARSD 

KLHELEKENLQLKAKLHDMEMERDMDRKKIEE 

LMEENMTLEMAQKQSMDESLHLGWELEQISRTS 

ELSEAPQKSLGHEVNELTSSRLLICLEMENQSLTK 

TVEELRTTVDSVEGNASKILKMEKENQRLSKKV 

EILENEIVQEKQSLQNCQNLSKDLMKEKAQLEKT 

lETLRENSERQIKILEQENEHLNQTVSSLRQRSQIS 

AEARVKDIEKENKILHESIKETSSKLSKIEFEKRQI 

KKELEHYKEKGERAEELENELHHLEKENELLQK 

KITNLKITCEKIEALEQENSELERENRKLKKTLDS 

FKNLTFQLESLEKENSQLDEENLELRRNVESLKC 

ASMKMAQLQLENKELESEKEQLKKGLELLKASF 

KKTERLEVSYQGLDIENQRLQKTLENSNKKIQQL 

ESELQDLEMENQTLQKNLEELKISSKRLEQLEKE 

l^SLEQETSQLEKDKKQLEKENKRLRQQAEIKD 

mEENNVKJGNLEKJS>KTLSmGIYKESCVRLE 

ELEKENKELVKRATIDIKTLVTLREDLVSEKLKT 

QQMNNDLEKLTHELEKIGLNKERLLHDEQSTDD 

SRYiaLESKLESTLKKSLEKEEKIAALEARLEES 

TNYNQQLRQELKTVKKK 


3151 


A


2 


2515 


GFWLHLTLLGASLPAALGWMDPGTSRGPDVGV 

GESQAEEPRSFEVTRREGLSSHNELLASCGKKFC 

SRGSRCVLSRKTGEPECQCLEACRPSYVPVCGSD 

GRFYENHCKLHRAACLLGKRITVIHSKDCFLKGD 

TCTMAGYARLK>rVLLALQTRLQPLQEGDSRQDP 

ASQKRLLVESLFRDLDADGNGHLSSSELAQHVL 

KKQDLDEDLLGCSPGDLLRFDDYNSDSSLTLREF 

YMAFQWQLSLAPEDRVSVTTVTVGLSTVLTCA 

VHGDLRPPIIWKRNGLTLNFLDLEDINDFGEDDS 

LYITKVTTIHMGNYTCHASGHEQLFQTHVLQVN 

VPPVIRVYPESQAQEPGVAASLRCHAEGIPMPRIT 

WLKNGVDVSTQMSKQLSLLANGSELHISSVRYE 

DTGAYTCIAKNEVGVDEDISSLFIEDSARKTLANI 

LWREEGLSVGNMFYVFSDDGIIVIHPVDCEIQRH 

LKPTEKIFMSYEEICPQREKNATQPCQWVSAVNV 

RJSIRYIYVAQPALSRVLVVDIQAHKVLQSIGVDPL 

PAKLSYDKSHDQVWVLSWGDVHKSRPSLQVITE 

ASTGQSQHLIRTPFAGVDDFFIPPTNLIINfflRFGFI 

FNKSDPAVHKVDLETMMPLKTIGLHHHGCVPQA 

MAHTHLGGYFFIQCRQDSPASAARQLLVDSVTD 

SVLGPNGDVTGTPHTSPDGRFIVSAAADSPWLHV 

QEITVRGEIQTLYDLQiNSGISDLAFQRSFTESNQ 

YNIYAALHTEPDLLFLELSTGKVGMLKNLKEPPA 

GPAQPWGGTHRIMRDSGLFGQYLLTPARESLFLI 

NGRQNTLRCEVSGIKGGTTVVWVGEV 


3152 


A 


1 


2645 


GAGWQVSLTGRWSPGREAGAGEVRQDPGSTAA 

SPSSCDADLSARMARGERRRRAVPAEGVRTAER 

AARGGPGRRDGRGGGPRSTAGGVALAVVVLSL 

ALGMSGRWVLAWYRARRAVTLHSAPAVLPADS 

SSPAVAPDLFWGTYRPHVYFGMKTRSPKPLLTG 

LMWAQQGTTPGTPKLRHTCEQGDGVGPYGWEF 

HDGLSFGRQHIQDGALRLTTEFVKRPGGQHGGD 

WSWRVTVEPQDSGTSALPLVSLFFYVVTDGKEV 

LLPEVGAKGQLKFISGHTSELGDFRFTLLPPTSPG 

DTAPKYGSYNVFWTSNPGLPLLTEMVKSRLNSW 

FQHRPPGASPERYLGLPGSLKWEDRGPSGQGQG 

QFLIQQVTLKIPISIEFVFESGSAQAGGNQALPRLA 

GSLLTQALESHAEGFRERFEKTFQLKEKGLSSGE 

QVLGQAALSGLLGGIGYFYGQGLVLPDIGVEGSE 

QKVDPALFPPVPLFTAVPSRSFFPRGFLWDEGFH 

QLVVQRWDPSLTREALGHWLGLLNADGWIGRE 

QILGDEARARVPPEFLVQRAVHANPPTLLLPVAH 

MLEVGDPDDLAFLRKALPRLHAWFSWLHQSQA 

GPLPLSYRWRGRDPALPTLLNPKTLPSGLDDYPR 

ASHPSVTERHLDLRCWVALGARVLTRLAEHLGE 

AEVAAELGPLAASLEAAESLDELHWAPELGVFA 

DFGNHTKAVQLKPRPPQGLVRVVGRPQPQLQYV 

DALGYVSLFPLLLRLLDPTSSRLGPLLDILADSRH 

LWSPFGLRSLAASSSFYGQRNSEHDPPYWRGAV 
WLNVNYLALGALHHYGHLEGPHQARAAKLHGE 
LRANVVGNVAVRQYQATGFLWEQYSDRDGRGM 
GCRPFHGWTSLVLLAMAEDY 


3153    A    1    4312


MVIKTDELPAAAPADSAREHGSQAGGKGRPGAA 

AVLLADLERDARQGECALPGAAMAGLAPLKPE 

ASRSSSPGPTGCIRARVAAEAGTRNPGNAGAELE 

SWLPCCHGHPETPEPRGGQLPTAPELPSVMLLNG 

DCPESLKKEAAAAEPPRENGLDEAGPGDETTGQ 

EVIVIQDTGFSVKILAPGIEPFSLQVSPQEMVQEIH 

QVLMDREDTCHRTCFSLHLDGNVLDHFSELRSV 

EGLQEGSVLRWEEPYTVREARIHVRHVRDLLKS 

LDPSDAFNGVDCNSLSFLSVFTDGDLGDSGKRK 

KGLEMDPIDCTPPEYILPGSRERPLCPLQPQNRD 

WKPLQCLKVLTMSGWNPPPG>mKMHGDLMYLF 

VITAEDRQVSITASTRGFYLNQSTAYHFNPKPASP 

RFLSHSLVELLNQISPTFKKNFAVLQKKRVQRHP 

FERIATPFQVYSWTAPQAEHAMDCVRAEDAYTS 

RLGYEEHIPGQTRDWNEELQTTRELPRKNLPERL 

LRERAIFKVHSDFTAAATRGAMAVIDGNVMAIN 

PSEETKMQMFIWNNIFFSLGFDVRDHYKDFGGD 

VAAYVAPTNDLNGVRTYNAVDVEGLYTLGTVV 

VDYRGYRVTAQSIIPGILERDQEQSVIYGSIDFGK 

TWSHPRYLELLERTSRPLKILRHQVLNDRDEEV 

ELCSSVECKGIIGNDGRHYILDLLRTFPPDLNFLP 

VPGEELPEECARAGFPRAHRHKLCCLRQELVDA 

FVEHRYLLFMKLAALQLMQQNASQLETPSSLEN 

GGPSSLESKSEDPPGQEAGSEEEGSSASGLAKVK 

ELAETIAADDGTDPRSREVIRNACKAVGSISSTAF 

DIRFNPDIFSPGVRFPESCQDEVRDQKQLLKDAA 

AFLLSCQIPGLVKDCjMEHAVLPVDGATLAEVMR 

QRGINMRYLGKVLELVLRSPARHQLDHVFKIGIG 

ELITRSAKHIFKTYLQGVELSGLSAAISHFLNCFLS 

SYPNPVAHLPADELVSKKRNKRRKNRPPGAADN 

TAWAVMTPQELWKNICQEAKNYFDFDLECETV

DQAVETYGLQKITLLREISLKTGIQVLLKEYSFDS 

RHKPAFTEEDVLNIFPVVKHVNPKASDAFHFFQS 

GQAKVQQGFLKEGCELINEALNLFNNVYGAMH 

VETCACLRLLARLHYIMGDYAEALSNQQKAVL 

MSERVMGTEHPNTIQEYMHLALYCFASSQLSTA 

LSLLYRARYLMLLVFGEDHPEMALLDNNIGLVL 

HGVMEYDLSLRPLENALAVSTKYHGPKALKVAL 

SHHLVARVYESKAEFRSALQHEKEGYTIYKTQL 

GEDHEKTKESSEYLKCLTQQAVALQRTMNEIYR 

NGSSANIPPLKFTAPSMASVLEQLNVINGILFIPLS 

QKDLENLKAEVARRHQLQEASRNRDRAEEPMA 

TEPAPAGAPGDLGSQPPAAKDPSPSVQG 


3154    A    416    4082


KFKLIKIMLLTLHLLPVVSKFSFVSLSAPQHWSCP 

EGTLAGNGNSTCVGPAPFLIFSHGNSIFRIDTEGT 

NYEQLVVDAGVSVIMDFHYNEKRIYWVDLERQ 

LLQRVFLNGSRQERVCNIEKNVSGMAINWINEEV 

IWSNQQEGIITVTDMKGNNSHILLSALKYPANVA 

VDPVERFDFWSSEVAGSLYRADLDGVGVKALLE 

TSEKITAVSLDVLDKRLFWIQYNREGSNSLICSCD 

YDGGSVHISKHPTQHNLFAMSLFGDRIFYSTWK 

MKTIWIANKHTGKDMVRINLHSSFVPLGELKVV 

HPLAQPKAEDDTWEPEQKLCKLRKGNCSSTVCG 

QDLQSHLCMCAEGYALSRDRKYCEGNDWKYCE 

DVNECAFWNHGCTLGCKNTPGSYYCTCPVGFVL 

LPDGKRCHQLVSCPRNVSECSHDCVLTSEGPLCF 

CPEGSVLERDGKTCSGCSSPDNGGCSQLCVPLSP 

VSWECDCFPGYDLQLDEKSCAASGPQPFLLFANS 

QDIRHMHFDGTDYGTLLSQQMGMVYALDHDPV 

ENKIYFAHTALKWIERANMDGSQRERLIEEGVD 

VPEGLAVDWIGRRFYWTDRGKSLIGRSDLNGKR 

SKIITIENISQPRGIAVHPMAKRLFWTDTGINPRIE 

SSSLQGLGRLVIASSDLIWPSGITIDFLTDKLYWC 

DAKQSVIEMANLDGSKRRRLTQNDVGHPFAVA 

VFEDYVWFSDWAMPSVIRVNKRTGKDRVRLQG 

SMLKPSSLVVVHPLAKPGADPCLYQNGGCEHIC 

KKRLGTAWCSCREGFMKASDGKTCLALDGHQL 

LAGGEVDLKNQVTPLDILSKTRVSEDNITESQHM 

LVAEIMVSDQDDCAPVGCSMYARCISEGEDATC 

QCLKGFAGDGKLCSDIDECEMGVPVCPPASSKCI 

NTEGGYVCRCSEGYQGDGIHCLDIDECQLGVHS 

CGENASCTNTEGGYTCMCAGRLSEPGLICPDSTP 

PPHLREDDHHYSVRNSDSECPLSHDGYCLHDGV 

CMYIEALDKYACNCVVGYIGERCQYRDLKWWE 

LRHAGHGQQQKVIWAVCVWLVMLLLLSLWG 

AHYYRTQKLLSKNPKNPYEESSRDVRSRRPADT 

EDGMSSCPQPWFVVIKEHQDLKNGGQPVAGED 

GQAADGSMQPTSWRQEPQLCGMGTEQGCWIPV 

SSDKGSCPQVMERSFHMPSYGTQTLEGGVEKPH 

SLLSANPLWQQRALDPPHQMELTQ 


3155    A    533    212


GTSGWYWERLAERRGRLWSREEAMATMENKVI 
CALVLVSMLALGTLAEAQTETCTVAPRERQNCG 
FPGVTPSQCANKGCCFDDTVRGVPWCFYPNTID 
VPPEEECEF 


3156    A    2    1585


PRVRAADVAAGAQAWSAGMAKSNGENGPRAP 

AAGESLSGTRESLAQGPDAATTDELSSLGSDSEA 

NGFAERRIDKFGFP/GSQGAEGALEEVPLEVLRQ 

RESKWLDMLW^WDKWMAKKHKKIRLRCQKGI 

PPSLRGRAWQYLSGGKVKLQQNPGKFDELDMSP 

GDPKWLDVIERDLHRQFPFHEMFVSRGGHGQQD 

LFRVLKAYTLYRPEEGYCQAQAPIAAVLLMHMP 

AEQAFWCLVQICEKYLPGYYSEKLEAIQLDGEIL 

FSLLQKVSPVAHKHLSRQKIDPLLYMTEWFMCA 

FSRTLPWSSVLRVWDMFFCEGVKIIFRVGLVLLK 

HALGSPEKVKACQGQYETIERLRSLSPKIMQEAF 

LVQEVVELPVTERQffiREHLLQLRRWQETRGELQ 

CRSPPRLHGAKAILDAEPGPRPALQPSPSIRLPLD 

APLPGSKAKPKPPKQAQKEQRKQMKGRGQLEKP 

PAPNQAMVVAAAGDACPPQHVPPKDSAPKDSAP 

QDLAPQVSAHHRSQESLTSQESEDTYL 


3157    A    3    601


SSAMGSRSSHAAVIPDGDSIRRETGFSQASLLRLH 

HRFRALDRNKKGYLSRMDLQQIGALAVNPLGDR 

IIESFFPDGSQRVDFPGFVRVLAHFRPVEDEDTET 

QDPKKPEPLNSRRNKLHYAFQLYDLDRDGKISR 

HEMLQVLRLMVGVQVTEEQLENIADRTVQEAD 

EDGDGAVSFVEFTKSLEKMDVEHKMSIRILK 


3158    A    2    409


ISSCPHTAYEGSMSTLSNFTQTLEDVFRRIFITYM 
DNWRQNTTAEQEALQAKVDAENFYYVILYLMV 
MIGMFSFnVAILVSTVKSKRREHSNDPYHQYIVE 
DWQEKYKSQILNLEESKATIHENIGAAGFKMSP 


3159    A    3    416


PWGAAELDMGRRDAQLLAALLVLGLCALAGSE 

KPSPCQCSRLSPHNRTNCGFPGITSDQCFDNGCCF 

DSSVTGVPWCFHPLPKQESDQCVMEVSDRRNCG 

YPGISPEECASRKCCFSNFIFEVPWCFFPKSVEDC 

HY 


3160    A    179    409


KPKTKILmVYYPELFVWVSQEPFPNKDMEGRL 
PKGRLPVPKEVNRKKNDETNAASLTPLGSSELRS 
PRISYLHFF 


3161    A    683    1186


LSSTGGLHAAACAAAMSLVIPEKFQHILRVLNTN 

IDGRRKIAFAITAIKGVGRRYAHWLRKADIDLT 

KRAGELTEDEVERVITIMQNPRQYKIPDWFLNRQ 

KDVKDGKYSQVLANGLDNiaREDLERLKKIRA 

HRGLRHFWGLRVRGQHTKTTGRRGRTVGVSKK 

K 


3162    A    1    1938


GMPRSRGGRAAPGPPPPPPPPGQAPRWSRWRVP 

GRLLLLLLPALCCLPGAARAAAAAAGAGNRAA 

VAVAVARADEAEAPFAGQNWLKSYGYLLPYDS 

RASALHSAKALQSAVSTMQQFYGIPVTGVLDQT 

TIEWMKKPRCGVPDHPHLSRRRRNKRYALTGQK 

WRQKHITYSIHNYTPKVGELDTRKAIRQAFDVW 

QKVTPLTFEEVPYHEIKSDRKEADIMIFFASGFHG 

DSSPFDGEGGFLAHAYFPGPGIGGDTHFDSDEPW 

TLGNANHDGNDLFLVAVHELGHALGLEHSSDPS 

AIMAPFYQYMETHNFKLPQDDLQGIQKIYGPPAE 

PLEPTRPLPTLPVRRIHSPSERKHERQPRPPRPPLG 

DRPSTPGTKPNICDGNFNTVALFRGEMFVFKDR 

WFWRLRNNRVQEGYPMQIEQFWKGLPARIDAA 

YERADGRFVFFKGDKYWVFKEVTVEPGYPHSLG 

ELGSCLPREGIDTALRWEPVGKTYFFKGERYWR 

YSEERRATDPGYPKPITVWKGIPQAPQGAFISKE 

GYYTYFYKGRDYWKFDNQKLSVEPGYPRNILRD 

WMGCNQKEVERRKERRLPQDDVDIMVTINDVP 

GSVNAVAVVPCILSLCILVLVYTIFQFKNKTGPQ 

PVTYYKRPVQEWV 


3163    A    1235    2223


SRLSLQFYVSFRRTGLFTCKLIVEIFFRNYMNDSL 

RTNVFVRFQPETIACACIYLAARALQIPLPTRPHW 

FLLFGTTEEEIQEICIETLRLYTRKKPNYELLEKEV 

EKRKVALQEAKLKAKGLNPDGTPALSTLGGFSP 

ASKPSSPREVKAEEKSPISINVKTVKKEPEDRQQA 

SKSPYNGVRKDSKRSRNSRSASRSRSRTRSRSRS 

HTPRRHYNNRRSRSGTYSSRSRSRSRSHSESPRR 

HHNHGSPHLKAKHTRDDLKSSNRHGHKRKKSRS 

RSQSKSRDHSDAAKKHRHERGHHRDRRERSRSF 

ERSHKSKHHGGSRSGHGRHRR 


3164    A    3    3274


DCRLQAAMPTNFTVVPVEAHADGGGDETAERT 

EAPGTPEGPEPERPSPGDGNPRENSPFLNNVEVE 

QESFFEGKNMALFEEEMDSNPMVSSLLNKLANY 

TNLSQGVVEHEEDEESRRREAKAPRMGTFIGVY 

LPCLQNILGVILFLRLTWIVGVAGVLESFLIVAMC 

CTCTMLTAISMSAIATNGWPAGGSYYMISRSLG 

PEFGGAVGLCFYLGTTFAGAMYTLGTIEIFLTYISP 

GAAIFQAEAAGGEAAAMLHNMRVYGTCTLVLM 

ALWFVGVKYVNKLALVFLACWLSILAIYAGVI 

KSAFDPPDIPVCLLGNRTLSRRSFDACVKAYGIH 

NNSATSALWGLFCNGSQPSAACDEYFIQNNVTEI 

QGEPGAASGVFLENLWSTYAHAGAFVEKKGVFS 

VPVAEESRASTLPYVLTDIAASFTLLVGIYFPSVT 

GIMAGSNRSGDLKDAQKSIPTGTILAIVTTSFIYLS 

CIVLFGACffiGWLRDKFGEALQGNLVIGMLAW 

PSPWVIVIGSFFSTCGAGLQTLTGAPRLLQAIARD 

GIVPFLQVFGHGKANGEPTWALLLTVLICETGILI 

ASLDSVAPILSMFFLMCYLFVNLACAVQTLLRTP 

NWRPRFKFYHWTLSFLGMSLCLALMFICSWYYA 

LSAMLIAGCIYKYIEYRGAEKEWGDGIRGLSLNA 

ARYALLRVEHGPPHTKNWRPQVLVMLNLDAEQ 

AMKHPRLLSFTSQLKAGKGLTIVGSVLEGTYLD 

KHMEAQRAEENIRSLMSTEKTKGFCQLVVSSSLR 

DGMSHLIQSAGLGGLKHNTVLMAWPASWKQED 

l^FSWKNFVDTVRDTTAAHQALLVAKNVDSFPQ 

NQERFGGGHIDVWWIVHDGGMLMLLPFLLRQH 

KVWRKCRMRIFTVAQVDDNSIQMKKDLQMFLY 

HLRISAEVEVVEMVENDISAFTYERTLMMEQRS 

QMLKQMQLSKNEQEREAQLIHDRNTASHTAAA 

ARTQAPPTPDKVQMTWTREKLIAEKYRSRDTSL 

SGFKDLFSMKPDQSNVRRMHTAVKLNGVVLNK 

SQDAQLVLLNMPGPPKNRQGDENYMEFLEVLTE 

GLNRVLLVRGGGREVITIYS 


3165    A    3    2681


GRGARGGSGAGALRGCRGYLQKLSGKGPSRGY 

RSRWF VFDARRC YLY YFKSPQDALPLGHLDl AD 

ACFSYQGPDEAAEPGTEPPAHFQVHSAGAVTVL 

KAPNRQLMTYWLQELQQKRWEYCNSLDMVKW 

DSRTSPTPGDFPKGLVARDNTDLIYPHPNASAEK 

ARNVLAVETVPGELVGEQAANQPAPGHPNSINF 

YSLKQWGNELKNSMSSFRPGRGHNDSRRTVFYT 

NEEWELLDPTPKDLEESIVQEEKKKLTPEGNKGV 

TGSGFPFDFGRNPYKGKRPLKDIIGSYKNRHSSG 

DPSSEGTSGSGSVSIRKPASEMQLQVQSQQEELE 

QLKKDLSSQKELVRLLQQTVRSSQYDKYFTSSRL 

CEGVPKDTLELLHQKDDQILGLTSQLERFSLEKE 

SLQQEVRTLKSKVGELNEQLGMLMETIQAKDEV 

IIKLSEGEGNGPPPTVAPSSPSVVPVARDQLELDR 

LKDNLQGYKTQNKFLNKEILELSALRRNPERRER 

DLMARNSSLEAKLCQIESKYLILLQEMKTPVCSE 

DQGPTREVIAQLLEDALQVESQEQPEQAFVKPHL 

VSEYDIYGFRTVPEDDEEEKLVAKVRALDLKTL 

YLTENQEVSTGVKWE^^V^FASTVNREMMCSPEL 

KNLIRAGIPHEHRSKVWKWCVDRHTRKFKDNTE 

PGHFQTLLQKALEKQNPASKQIELDLLRTLPNNK 

HYSCPTSEGIQKLRNVLLAFSWRNPDIGYCQGLN 

RLVAVALLYLEQEDAFWCLVTIVEVFMPRDYYT 

KTLLGSQVDQRVFRDLMSEKLPRLHGHFEQYKV 

DYTLITFNWFLVVFVDSVVSDILFKIWDSFLYEGP 

KVIFRFALALFKYKEEEILKLQDSMSIFKYLRYFT 

RTELDARSGTDAPTTWRKSGWS 


3166    A    10    4070


FPGPTISSNSQLYRASALFETIRHEAQLSTDYKLS 
LFDLQTSSYQALQRVLVSLGHHDEALAVAERGR 

TRAFADLLVERQTGQQDSDPYSPVTIDQILEMVN 

GQRGLVLYYSLAAGYLYSWLLAPGAGIVKFHEH 

YLGENTVENSSDFQASSSVTLPTATGSALEQHIAS 

VREALGVESHYSRACASSETESEAGDIMDQQFEE 

MNNKLNSVTDPTGFLRMVRR^WLImSCQSMTS 

LFSNTVSPTQDGTSSLPRRQSSFAKPPLRALYDLL 

lAPMEGGLMHSSGPVGRHRQLILVLEGELYLIPF 

ALLKGSSSNEYLYERFGLLAVPSIRSLSVQSKSHL 

RKNPPTYSSSTSMAAVIGNPKLPSAVMDRWLWG 

PMPSAEEEAYMVSELLGCQPLVGSVATKERVMS 

ALTQAECVHFATfflSWKLSALVLTPSMDGNPASS 

KSSFGHPYTIPESLRVQDDASDGESISDCPPLQEL 

LLTAADVLDLQLPVKLWLGSSQESNSKVAADG 

VIALTRAFLAAGAQCVLVSLWPVPVAAFKMFIH 

AFYSSLLNGLKASAALGEAMKVVQSSKAFSHPS 

NWAGFMLIGSDVKLNSPSSLIGQALTEILQHPER 

ARDALRVLLHLVEKSLQRIQNGQRNAMYTSQQS 

VENKVGGIPGWQALLTAVGFRLDPPTSGLPAAV 

FFPTSDPGDRLQQCSSTLQSLLGLPNPALQALCK 

LITASETGEQLISRAVKNMVGMLHQVLVQLQAG 

EKEQDLASAPIQVSISVQLWRLPGCHEFLAALGF 

VLCEVGQEEVILKTGKQANRRTVHFALQSLLSLF 

DSTELPKRLSLDSSSSLESLASAQSVSNALPLGYQ 

QPPFSPTGADSIAiSDAISVYSLSSIASSMSFVSKPE 

GGSEGGGPGGRQDHDRSKNAYLQRSTLPRSQLP 

PQTRPAGNKDEEEYEGFSIISNEPLATYQENRNTC 

FSPDHKQPQPGTAGGMRVSVSSKGSISTPNSPVK 

MTLIPSPNSPFQKVGKLASSDTGESDQSSTETDST 

VKSQEESNPKLDPQELAQKILEETQSHLIAVERLQ 

RSGGQVSKSNNPEDGVQAPSSTAVFRASETSAFS 

RPVLSHQKSQPSPVTVKPKPPARSSSLPKVSSGYS 

SPTTSEMSDCDSPSQHSGRPSPGCDSQTSQLDQPL 

FKLKYPSSPYSAHISKSPKNMSPSSGHQSPAGSAP 

SPALSYSSAGSARSSPADAPDIDKLKMAAIDEKV 

QAVHNLKMFWQSTPQHSTGPMKIFRGAPGTMTS 

KRDVLSLLNLSPRPNKKEEGVDKLELKELSLQQH 

DGAPPKAPPNGHWRTETTSLGSLPLPAGPPATAP 

ARPLRLPSGNGYKFLSPGRFFPSSKC 


3167    A    1    762


AARRRQKGKEENMMMDLFETGSYFFYLDGENV 

TLQPLEVAEGSPLYPGSDGTLSPCQDQMPPEAGS 

DSSGEEHVLAPPGLQPPHCPGQCLIWACKTCKRK 

SAPTDRRKAATLRERRRLKKINEAFEALKRRTVA 

NPNQRLPKVEILRSAISYIERLQDLLHRLDQQEK 

MQELGVDPFSYRPKQENLEGADFLRTCSSQWPS 

VSDHSRGLVITAKEGGASIDSSASSSLRCLSSIVDS 

ISSEERKLPCVEEVVEK 


3168    A    701    246


TSRRVTMKFNPFVTSDRSKNRKRHFNAPSHVRR 

BaMSSPLSKELRQKYNVRSMPIRKDDEVQVVRG 

HYKGQQIGKVVQVYRKKYVIYIERVQREKANGT 

TVHVGIHPSKWITRLKLDKDRKKILERKAKSRQ 

VGKEKGKYKEELIEKMQE 


3169    A    156    3168


GPGGAISLSVEAKAGADLLVKGKQARMDIYDTQ 
TLGWVFGGFMVVSAIGIFLVSTFSMKETSYEEA 
LANQRKEMAKTHHQKVEKKKKEKTVEKKGKT 
KKKEEKPNGKIPDHDPAPNVTVLLREPVRAPAV 



WO 01/57190    PCT/US01/04098




AVAPTPVQPPIIVAPVATVPAMPQEKLASSPKDK 

KKKEKKVAKVEPAVSSWNSIQVLTSKAAILETA 

PKEGRNTDVAQSPEAPKQEAPAKKKSGSKKKGP 

PDADGPLYLPYKTLVSTVGSMVFNEGEAQRLIEI 

LSEKAGIIQDTWHKATQKGDPVAILKRQLEEKEK 

LLATEQEDAAVAKSKLRELNKEMAABKAKAAA 

GEAKVKKQLVAREQEITAVQARMQASYREHVK 

EVQQLQGKIRTLQEQLENGPNTQLARLQQENSIL 

RDALNQATSQVESKQNAELAKLRQELSKVSKEL 

VEKSEAVRQDEQQRKALEAKAAAFEKQVLQLQ 

ASHRESEEALQKRLDEVSRELCHTQSSHASLRAD 

AEKAQEQQQQMAELHSKLQSSEAEVRSKCEELS 

GLHGQLQEARAENSQLTERIRSIEALLEAGQARD 

AQDVQASQAEADQQQTRLKELESQVSGLEKEAI 

ELREAVEQQKVKNNDLREKNWKAMEALATAEQ 

ACKEKLHSLTQAKEESEKQLCLIEAQTMEALLAL 

LPELSVLAQQNYTEWLQDLKEKGPTLLKHPPAP 

AEPSSDLASKLREAEETQSTLQAECDQYRSILAET 

EGMLRDLQKSVEEEEQVWRAKVGAAEEELQKS 

RVTVKHLEEIVEKLKGELESSDQVREHTSHLEAE 

LEKHMAAASAECQNYAKEVAGLRQLLLESQSQL 

DAAKSEAQKQSDELALVRQQLSEMKSHVEDGDI 

AGAPASSPEAPPAEQDPVQLKTQLEWTEAILEDE 

QTQRQKLTAEFEEAQTSACRLQEELEKLRTAGPL 

ESSETEEASQLKERLEKEKKLTSDLGRAATRLQE 

LLKTTQEQLAREKDTVKKLQEQLEKAEDGSSSK 

EGTSV 


3170    A    6730    4027


THASEKYSYGHLPTHSITAHPMVTIRISDRQRLIQ 

PYIHNYSWLLFAALALYSAHLASAEDVDGEKLD 

PQTRSSATTLRSQCMQLVGDCLMKAHQGKGLK 

ALALLGVLPDGDSSLEDHALPVTVPTGASEEQLE 

KKAVQGAELSEAGNGKRAVHEEIRPVDFKQRNK 

ADKGVSLSKDPSCQTQISDSPADASPPTGLPDAE 

DSEVSSQKPDEEKAVTPSPEQVFAECSQKRILGLL 

AAMLPPLKSGPTVPLIDLEHVLPLMFQVVISNAG 

HLNETYHLTLGLLGQLIIRLLPAEVDAAVIKVLSA 

KHNLFAAGDSSIVPDGWKTTHLLFSLGAVCLDS 

RVGLDWACSMAEILRSLNSAPLWRDVIATFTDH 

CIKQLPFQLKHTNIFTLLVLVGFPQVLCVGTRCV 

YMDNANEPHNVIILKHFTEKNRAVIVDVKTRKR 

KTVKDYQLVQKGGGQECGDSRAQLSQYSQHFA 

FIASHLLQSSN4DSHCPEAVEATWVLSLALKGLY 

KTLKAHGFEEIRATFLQTDLLKLLVKKCSKGTGF 

SKTWLLRDLEILSIMLYSSKKEINALAEHGDLEL 

DERGDREEEVERPVSSPGDPEQKKLDPLEGLDEP 

TRICFLMAHDALNAPLHILRAIYELQMKKTDYFF 

LEVQKRFDGDELTTDERIRSLAQRWQPSKSLRLE 

EQSAKAVDTDMIILPCLSRPARCDQATAESNPVT 

QKLISSTESELQQSYAKQRRSKSAALLHKELNCK 

SKRAVRDYLFRVNEATAVLYARHVLASLLAEWP 

SHVPVSEDELELSGPAHMTYILDMFMQLEEKHE 

WEKWMQTELVLTHQVLPLPHRLPPVSASWSEA 

TCVAVQLPDRCECSKGRVTVSSPKDWASEELRG 

PERDFQLNQKALSPSSQFPSAEILRHIR 


3171    A    557    89


GTRAGPVKDREAFQRLNFLYQAAHCVLAQDPEN 

QALARFYCYTERTIAKRLVLRRDPSVKRTLCRGC 
SSLLVPGLTCTQRQRRCRGQRWTVQTCLTCQRS 
QRFLNDPGHLLWGDRPEAQLGSQADSKPLQPLP 
NTAHSISDRLPEEKMQTQGSSNQ 


3172    A    2    496


FRRAGAGRGRRRGEVTSPLSPEPLAFQSLATSRR 

PEPQTTQTVRSSALPAPPASPMSQYAPSPDFKRA 

LDSSPEANTEDDKTEEDVPMPKNYLWLTIVSCFC 

PAYPINIVALVFSIMSLNSYNDGDYEGARRLGRN 

AKWVAIASIIIGLLIIGISCAVHFTRNA 


3173    A    2    4048


FRSGGCRRRAWTSRWPQRRRSPESCEAPLSAPL 

WGPQRGLPGREPLRSRSASAIALRTIGHILALLLR 

LLHLGLGSGGCREDVPPSGRGKKEEKMKKHRRA 

LALVSCLFLCSLVWLPSWRVCCKESSSASASSYY 

SQDDNCALENEDVQFQKKDEREGPINAESLGKS

GSNLPISPKEHKLKDDSIVDVQNTESKKLSPPWE 

TLPTVDLHEESSNAVVDSETVENISSSSTSEITPIS 

KLDEIEKSGTIPIAKPSETEQSETDCDVGEALDAS 

APIEQPSFVSPPDSLVGQHIENVSSSHGKGKITKSE 

FESKVSASEQGGGDPKSALNASDNLKNESSDYT 

KPGDIDPTSVASPKDPEDIPTFDEWKKKVMEVEK 

EKSQSMHASSNGGSHATKKVQKNRNNYASVEC 

GAKILAANPEAKSTSAILIENMDLYMLNPCSTKI 

WFVIELCEPIQVKQLDIANYELFSSTPKDFLVSISD 

RYPTNKWIKLGTFHGRDERNVQSFPLDEQMYAK 

YVKMFIKYIKVELLSHFGSEHFCPLSLIRVFGTSM 

VEEYEEIADSQYHSERQELFDEDYDYPLDYNTGE 

DKSSKNLLGSATNAILNMVNIAANILGAKTEDLT 

EGNKSISENATATAAPKMPESTPVSTPVPSPEYVT 

TEVHTHDMEPSTPDTPKESPIVQLVQEEEEEASPS 

TVTLLGSGEQEDESSPWFESETQFCSELTTICCIS 

SFSEYIYKWCSVRVALYRQRSRTALSKGKDYLV 

LAQPPLLLPAESVDVSVLQPLSGELENTNIEREAE 

TVVLGDLSSSMHQDDLVNHTVDAVELEPSHSQT 

LSQSLLLDITPEINPLPKIEVSESVEYEAGHIPSPVI 

PQESSVEIDNETEQKSESFSSIEKPSITYETNKVNE 

LMDNIIKEDVNSMQIFTKLSETIVPPINTATVPDN 

EDGEAKMNIADTAKQTLISVVDSSSLPEVKEEEQ 

SPEDALLRGLQRTATDFYAELQNSTDLGYANGN 

LVHGSNQKESVFMPILNNRIKALEVNMSLSGRYL 

EELSQRYRKQMEEMQKAFNKTIVKLQNTSRIAE 

EQDQRQTEAIQLLQAQLTNMTQLVSNLSATVAE 

LKREVSDRQSYLVISLVLCVVLGLMLCMQRCRN 

TSQFDGDYISKLPKSNQYPSPKRCFSSYDDMNLK 

RRTSFPLMRSKSLQLTGKEVDPNDLYIVEPLKFSP 

EKKKKRCKYKIEKIETKPEEPLHPIANGDIKGRK 

PFTNQRDFSNMGEVYHSSYKGPPSEGSSETSSQS 

EESYFCGISACTSLCNGQSQKTKTEKRALKRRRS 

KVQDQGKLIKTLIQTKSGSLPSLHDIIKGNKEITV 

GTFGVTAVSGHI 


3174    A    485    4668


RKCSKEKASKTPSQKIPTTPCCVLQAGPEPRSLAE 
RMGADGETVVLKNMLIGVNLILLGSMIKPSECQL 
EVTTERVQRQSVEEEGGIANYNTSSKEQPWFNH 
VYNINVPLDhnLCSSGLEASAEQEVSAEDETLAEY 
MGQTSDHESQVTFTHRINFPKKACPCASSAQVLQ 
ELLSRIEMLEREVSVLRDQCNANCCQESAATGQL 

DYIPHCSGHGNFSFESCGCICNEGWFGKNCSEPY 

CPLGCSSRGVCVDGQCICDSEYSGDDCSELRCPT 

DCSSRGLCVDGECVCEEPYTGEDCRELRCPGDCS 

GKGRCANGTCLCEEGYVGEDCGQRQCLNACSG 

RGQCEEGLCVCEEGYQGPDCSAVAPPEDLRVAG 

ISDRSIELEWDGPMAVTEYVISYQPTALGGLQLQ 

QRVPGDWSGVTITELEPGLTYNISVYAVISNILSL 

PITAKVATHLSTPQGLQFKTITETTVEVQWEPFSF 

SFDGWEISFIPKIWEGGVIAQVPSDVTSFNQTGLK 

PGEEYIVNVVALKEQARSPPTSASVSTVIDGPTQI 

LVRDVSDTVAFVEWIPPRAKVDFELLKYGLVGGE 

GGRTTFRLQPPLSQYSVQALRPGSRYEVSVSAVR 

GTNESDSATTQFTTEIDAPKNLRVGSRTATSLDL 

EWDNSEAEVQEYKVVYITLAGEQYHEVLVPRGI 

GPTTRATLTDLVPGTEYGVGISAVMNSQQSVPAT 

MNARTELDSPRDLMVTASSETSISLIWTKASGPID 

HYRITFTPSSGIASEVTVPKDRTSYTLTDLEPGAE 

YnSVTAERGRQQSLESTVDAFTGFRPISHLHFSH 

VTSSSVNITWSDPSPPADRLILNYSPRDEEEEMME 

VSLDATKRHAVLMGLQPATEYIVNLVAVHGTVT 

SEPIVGSITTGIDPPKDITISNVTKDSVMVSWSPPV 

ASFDYYRVSYRPTQVGRLDSSVVPNTVTEFTITR 

LNPATEYEISLNSVRGREESERICTLVHTAMDNP 

VDLIATMTPTEALLQWKAPVGEVENYVIVLTHF 

AVAGEmVDGVSEEFRLVDLLPSTHYTATMYAT 

NGPLTSGTISTNFSTLLDPPANLTASEVTRQSALIS 

WQPPRAEIENYVLTYKSTDGSRKELIVDAEDTWI 

RLEGLLENTDYTVLLQAAQDTTWSSITSTAFTTG 

GRVFPHPQDCAQHLMNGDTLSGVYPIFLNGELS 

QKLQVYCDMTTDGGGWIVFQRRQNGQTDFFRK 

WADYRVGFGNVEDEFWLGLDNIHRITSQGRYEL 

RVDMRDGQEAAFASYDRFSVEDSRNLYKLRIGS 

YNGTAGDSLSYHQGRPFSTEDRDNDVAVTNCA 

MSYKGAWWYKNCHRTNLNGKYGESRHSQGIN 

WYHWKGHEFSIPFVEMKMRPYNHRLMAGRKRQ 

SLQF 


3175    A    2    623


RLQLPACPALSAAHPLALPSFSSQCHRAEARAAA 

AATAEGT]VL\SGVTVNDEVIKVFNDMKVRKSST 

QEEIKKRKKAVLFCLSDDKRQIIVEEAKQILVGDI 

GDTVEDPYTSFVKLLPLNDCRYALYDATYETKE 

SKKEDLVFIFWAPBSAPLKSKMIYASSKDAIKKK 

FTGIKHEWQVNGLDDIKDRSTLGEKLGGNVVVS 

LEGKPL 


3176    A    99    1567


PRGCWSSCLDAMFRLNSLSALAELAVGSRWYH 

GGSQPIQIRRRLMMVAFLGASAVTASTGLLWKR 

AHAESPPCVDNLKSDIGDKGKNKDEGDVCNHEK 

KTADLAPHPEEKKKKRSGFRDRKVMEYENRIRA 

YSTPDKIFRYFATLKVISEPGEAEVFMTPEDFVRS 

ITPNEKQPEHLGLDQYIIKRFDGKTEKISQEREKF 

ADEGSIFYTLGECGLISFSDYIFLTTVLSTPQRNFE 

lAFKMFDLNGDGEVDMEEFEQVQSnRSQTSMG 

MRHRDRPTTGNTLKSGLCSALTTYFFGADLKGK 

LTIKNFLEFQRKLQHDVLKLEFERHDPVDGRITE 

RQFGGMLLAYSGVQSKKLTAMQRQLKKHFKEG 

KGLTFQEVENFFTFLKNINDVDTALSFYHMAGAS 

LDKVTMQQVARTVAKVELSDHVCDVVFALFDC 
DGNGELSNKEFVSIMKQRLMRGLEKPKDMGFTR 
LMQAMWKCAQETAWDFALPKQ 


3177    A    182    648


LGVVGSGAAVGGRQAARGAALGRRPMAAVLG 

ALGATRRLLAALRGQSLGLAAMSSGTHRLTAEE 

RNQAILDLKAAGWSELSERDAIYKEFSFHNFNQA 

FGFMSRVALQAEKMNHHPEWFNVYNKVQITLTS 

HDCGELTKKDVKLAKFIEKAAASV 


3178    A    8    612


ACGCRSFCGSTVMSLLLYYALPALGSYAMLSIFF 

LRRPHLLHTPRAPTFRIRLGAHRGGSGELLENTM 

EAMENSMAQRSDLLELDCQLTRDRVVVVSHDE 

NLCRQSGLNRDVGSLDFEDLPLYKEKLEVYFSPG 

HFAHGSDRRMVRLEDLFQRFPRTPMSVEIKGKN 

EELlREIAGLVRRYDRNEinWASEKSSVMKKCK 


3179    A    88    1496


QETSKMETLSFPRYNVAEIVIHIRNKILTGADGKN 

LTKI^LYPNPKPEVLHMIYMRALQIVYGIRLEHF 

YMMPVNSEVMYPHLMEGFLPFSNLVTHLDSFLPI 

CRVNDFETADILCPKAKRTSRFLSGIINFIHFREAC 

RETYMEFLWQYKSSADKMQQLNAAHQEALMK 

LERLDSVPVEEQEEFKQLSDGIQELQQSLNQDFH 

QKTIVLQEGNSQKKSNISEKTKRLNELKLSVVSL 

KEIQESLKTKIVDSPEKLKNYKEKMKDTVQKLK 

NARQEWEKYEIYGDSVDCLPSCQLEVQLYQKK 

IQDLSDNREKLASILKESLNLEDQIESDESBLKKL 

KTEENSFKRLMIVKKEKLATAQFKINKKHEDVK 

QYKRTVIEDCNKVQEKRGAVYERVTTINHEIQKI 

RLGIQQLKDAADREKLKSQEIFLNLKTALEKYHD 

GIEKAAEDSYAKIDEKTAELKRKMFKMST 


3180    A    298    7086


GNMACWPQLRLLLWKNLTFRRRQTCQLLLEVA 

WPLFIFLILISVRLSYPPYEQHECHFPNKAMPSAG 

TLPWVQGIICNANNPCFRYPTPGEAPGVVGNFNK 

SIVARLFSDARRLLLYSQKDTSMKDMRKVLRTL 

QQIKKSSSNLKLQDFLVDNETFSGFLYHNLSLPK 

STVDKMLRADVILHKVFLQGYQLHLTSLCNGSK 

SEEMIQLGDQEVSELCGLPREKLAAAERVLRSN 

MDILKPILRTLNSTSPFPSKELAEATKTLLHSLGT 

LAQELFSMRSWSDMRQEVMFLTNVNSSSSSTQI 

YQAVSRIVCGHPEGGGLBaKSLNWYEDNNYKAL 

FGGNGTEEDAETFYDNSTTPYCNDLMKNLESSPL 

SRnWKALKPLLVGKILYTPDTPATRQVMAEVNK 

TFQELAVFHDLEGMWEELSPKIWTFMENSQEMD 

LVRMLLDSRDNDHFWEQQLDGLDWTAQDIVAF 

LAKHPEDVQSSNGSVYTWREAFNETNQAIRTISR 

FMECVNLNKXEPIATEVWLINKSMELLDERKFW 

AGIVFTGITPGSIELPHHVKYKIRMGIDNVERTNK 

IKDGYWDPGPRADPFEDMRYVWGGFAYLQDVV 

EQAIIRVLTGTEKKTGVYMQQMPYPCYVDDIFLR 

VMSRSMPLFMTLAWFYSVAVIIKGIVYEKEARLK 

ETMRIMGLDNSILWFSWFISSLIPLLVSAGLLVVI 

LKLGNLLPYSDPSWFVFLSVFAVVTILQCFLISt 

LFSRANLAAACGGIIYFTLYLPYVLCVAWQDYV 

GFTLKIFASLLSPVAFGFGCEYFALFEEQGIGVQW 

DNLFESPVEEDGFNLTTSVSMMLFDTFLYGVMT 

WYIEAVFPGQYGIPRPWYFPCTKSYWFGEESDEK 

SHPGSNQKRISEICMEEEPTHLKLGVSIQNLVKVY 

RDGMKVAVDGLALNFYEGQITSFLGHNGAGKTT 

TMSILTGLFPPTSGTAYILGKDIRSEMSTIRQNLG 

VCPQHNVLFDMLTVEEfflWFYARLKGLSEKHVK 

AEMEQMALDVGLPSSKLKSKTSQLSGGMQRKLS 

VALAFVGGSKVVILDEPTAGVDPYSRRGIWELLL 

KYRQGRTIILSTHHMDEADVLGDRlAnSHGKLCC 

VGSSLFLKNQLGTGYYLTLVKKDVESSLSSCRNS 

SSTVSYLKKEDSVSQSSSDAGLGSDHESDTLTID 

VSAISNLIRKHVSEARLVEDIGHELTYVLPYEAA 

ICEGAFVELFHEIDDRLSDLGISSYGISETTLEEIFL 

KVAEESGVDAETSDGTLPARRNRRAFGDKQSCL 

RPFTEDDAADPNDSDIDPESRETDLLSGMDGKGS

YQVKGWKLTQQQFVALLWKRLLIARRSRKGFF 

AQIVLPAVFVCIALVFSLIVPPFGKYPSLELQPWM 

YNEQYTFVSNDAPEDTGTLELLNALTKDPGFGT 

RCMEGNPIPDTPCQAGEEEWTTAPVPQTIMDLFQ 

NGNWTMQNPSPACQCSSDKKKMLPVCPPGAGG 

LPPPQRKQNTADILQDLTGRNISDYLVKTYVQIIA 

KSLKNKIWVNEFRYGGFSLGVSNTQALPPSQEV 

NDATKQMKKHLKLAKDSSADRFLNSLGRFMTG 

LDTRNNVKVWFNNKGWHAISSFLNVINNAILRA 

NLQKGENPSHYGITAFNHPLNLTKQQLSEVAPM 

TTSVDVLVSICVIFAMSFVPASFVVFLIQERVSKA 

KHLQFISGVKPVIYWLSNFVWDMCNYVVPATLV 

IIIFICFQQKSYVSSTNLPVLALLLLLYGWSITPLM 

YPASFVFKIPSTAYWLTSVNLFIGINGSVATFVL 

ELFTDNKLNNINDILKSVFLIFPHFCLGRGLIDMV 

KNQAMADALERFGENRFVSPLSWDLVGRNLFA 

MAVEGVVFFLITVLIQYRFFIRPRPVNAKLSPLND 

EDEDVRRERQRILDGGGQNDILEIKELTKIYRRK 

RKPAVDRJCVGIPPGECFGLLGVNGAGKSSTFKM 

LTGDTTVTRGDAFLNRNSILSNIHEVHQNMGYCP 

QFDAITELLTGREHVEFFALLRGVPEKEVGKVGE 

WAIRKLGLVKYGiEKYAGNYSGGNKRKLSTAMA 

LIGGPPVVFLDEPTTGMDPKARRPLWNCALSVV 

KEGRSVVLTSHSMEECEALCTRMAIMVNGRFRC 

LGSVQHLKNRFGDGYTIVVRIAGSNPDLKPVQDF 

FGLAFPGSVPKEKHRNMLQYQLPSSLSSLARIFSI 

LSQSKKRLfflEDYSVSQTTLDQVFVNFAKDQSDD 

DHLKDLSLHKNQTVVDVAVLTSFLQDEKVKESY 

V 


3181    A    215    1367


PPATSQAALPEALSKGRETPRPATHPARSQDVRP 

LSCPFDFLRDNVEWSEEQAAAAERKVQENSIQR 

VCQEKQVDYEINAHKYWNDFYKIHENGFFKDR 

HWLFTEFPELAPSQNQNHLKDWFLENKSEVPEC 

RNNEDGPGLIMEEQHKCSSKSLEHKTQTPPVEEN 

VTQKISDLEICADEFPGSSATYRILEVGCGVGNTV 

FPELQTNNDPGLFVYCCDFSSTAIELVQTNSEYDP 

SRCFAFVHDLCDEEKSYPVPKGSLDniLIFVLSAl 

VPDKMQKAINRLSRLLKPGGMVLLRDYGRYDM 

AQLRFKKGQCLSGNFYVRGDGTRVYFFTQEELD 

TLFTTAGLEKVQNLVDRRLQVNRGKQLTMYRV 

WIQCKYCKPLLSSTS 


3182    A    3    1289


GSETQHLPRDPQHLPWDPQQHQDRRRPELFHAF 
ARDSAPPPSMVLAAETTSQQERLQAIAEKRKRQ 

AEIENKRRQLEDERRQLQHLKSKALRERWLLEG 

TPSSASEGDEDLRRQMQDDEQKTRLLEDSVSRLE 

KGIEVLERGDSAPAAAKENAAAPSPVRAPAPSPA 

KEERKTEVVMNSQQTPVGTPKDKRVSNTPLRTV 

DGSPMMKAAMYSVEITVEKDKVTGETRVLSSTT 

LLPRQPLPLGIKVYEDETKVVHAVDGTAENGIHP 

LSSSEVDELIHKADEVTLSEAGSTAGAAETRGAV 

EGAARTTPSRREITGVQAQPGEATSGPPGIQPGQE 

PPVTMIFMGYQNVEDEAETKKVLGLQDTITAEL 

WIEDAAEPKEPAPPNGSAAEPPTEAASREENQA 

GPEATTSDPQDLDMKKHRCKCCSIM 


3183    A    333    1931


lAPTGGSHSEIQKQLGSGGDSSSQRRAERRTEPRS 

APRPRWGRSARSPGAHKLPGPPRRRDPGAWARL 

EAAAAHRHSRGSMGRRMRGAAATAGLWLLAL 

GSLLALWGGLLPPRTELPASRPPEDRLPRRPARS 

GGPAPAPRFPLPPPLAWDARGGSLKTFRALLTLA 

AGADGPPRQSRSEPRWHVSARQPRPEESAAVHG 

GVFWSRGLEEQVPPGFSEAQAAAWLEAARGAR 

MVALERGGCGRSSNRLARFADGTRACVRYGINP 

EQIQGEALSYYLARLLGLQRHVPPLALARVEAR 

GAQWAQVQEELRAAHWTEGSVVSLTRWLPNLT 

DVVVPAPWRSEDGRLRPLRDAGGELANLSQAEL 

VDLVQWTDLILFDYLTANFDRLVSNLFSLQWDP 

RVMQRATSNLHRGPGGALVFLDNEAGLVHGYR 

VAGMWDKYNEPLLQSVCVFRERTARRVLELHR

GQDAAARLLRLYRRHEPRFPELAALADPHAQLL 

QRRLDFLAKHILHCKAKYGRRSGDLVSPGGKER 

DLGLGYG 


3184    A    1    1004


GSTHASADAWAQWFCTEALVMGAPVWYLVAA 

ALLVGFILFLTRSRGRAASAGQEPLHNEELAGAG 

RVAQPGPLEPEEPRAGGRPRRRRDLGSRLQAQR 

RAQRVAWAEADENEEEAVILAQEEEGVEKPAET 

HLSGKIGAKKLRKLEEKQARKAQREAEEAEREE 

RKRLESQREAEWKKEEERLRLEEEQKEEEERKA 

REEQAQREHEEYLKLKEAFVVEEEGVGETMTEE 

QSQSFLTEFINYIKQSKVVLLEDLASQVGLRTQD 

TINRIQDLLAEGTITGVIDDRGKFIYITPEELAAVA 

NFIRQRGRVSIAELAQASNSLIAWGRESPAQAPA 


3185    A    2981    7173


CLLAGKFSSTLYETGGCDMSLVNFEPAARRASNI 

CDTDSHVSSSTSVRFYPHDVLSLPQIRLNRLLTID 

TDLLEQQDIDLSPDLAATYGPTEEAAQKVKHYY 

RFWILPQLWIGINFDRLTLLALFDRNREILENVLA 

VILAJLVAFLGSILLIQGFFRDIWVFQFCLVIASCQ 

YSLLKSVQPDSSSPRHGHNRnAYSRPVYFCICCG 

LIWLLDYGSRNLTATKFKLYGITFTOPLVFISAKD 

LVIVFTLCFPIVFFIGLLPQVNTFVMYLCEQLDIHI 

FGGNATTSLLAALYSFICSIVAVALLYGLCYGAL 

KDSWDGQHIPVLFSIFCGLLVAVSYHLSRQSSDP 

SVLFSLVQSKIFPKTEEKNPEDPLSEVKDPLPEKL 

RNSVSERLQSDLVVCIVIGVLYFAIHVSTVFTVLQ 

PALKYVLYTLVGFVGFVTHYVLPQVRKQLPWH 

CFSHPLLKTLEYNQYEVRNAATMMWFEKLHVW 

LLFVEKNDYPLIVLNELSSSAETIASPKKLNTELG 

ALMITVAGLKLLRSSFSSPTYQYVTVIFTVLFFKF 

DYEAFSETMLLDLFFMSELFNKLWELLYKLQFVY 

TYIAPWQITWGSAFHAFAQPFAVPHSAMLFIQAA 

VSAFFSTPLNPFLGSAMTSYVRPVKFWERDYNT 

KRVDHSNTRLASQLDRNPGTYCQQREVEAITEG 

VEEDEGFCCCEPGHIPHMLSFNAAFSQRWLAWE 

VIVTKYILEGYSITDNSAASMLQVFDLRXVLTTY 

YVKGIIYYVTTSSKLEEWLANETMQEGLRLCAD 

RNYVDVDPTFNPNIDEDYDHRLAGISRESFCVIY 

LNWIEYCSSRRAKPVDVDKDSSLVTLCYGLCVL 

GRRALGTASHHMSSNLESFLYGLHALFKGDFRIS 

SIRDEWIFADMELLRKVVVPGIRMSIKLHQDHFT 

SPDEYDDPTVLYEAIVSHEKNLVIAHEGDPAWRS 

AVLANSPSLLALRHVMDDGTNEYKIIMLNRRYL 

SFRVIKVNKECVRGLWAGQQQELVFLRNRNPER 

GSIQNAKQALRNMINSSCDQPIGYPIFVSPLTTSY 

SDSHEQLKDILGGPISLGNIRNFIVSTWHRLRKGC 

GAGCNSGGNIEDSDTGGGTSCTGNNATTANNPH 

SNVTQGSIGNPGQGSGTGLHPPVTSYPPTLGTSHS 

SHSVQSGLVRQSPARASVASQSSYCYSSRHSSLR 

MSTTGFVPCRRSSTSQISLRNLPSSIQSRLSMVNQ 

MEPSGQSGLACVQHGLPSSSSSSQSIPACKHHTL 

VGFLATBGGQSSATDAQPGNTLSPANNSHSRKA 

EVIYRVQIVDPSQILEGINLSKRKELQWPDEGIRL 

KAGRNSWKDWSPQEGMEGHVIHRWVPCSRDPG 

TRSHIDKAVLLVQIDDKYVTVIETGVLELGAEV 


3186    A    3    470


SLSAMRFLAATFLLLALSTAAQAEPVQFKDCGSV 

DGVIKEVNVSPCPTQPCQLSKGQSYSVNVTFTSN 

IQSKSSKAVVHGILMGVPVPFPIPEPDGCKSGINC 

PIQKDKTYSYLNKLPVKSEYPSIKLVVEWQLQDD 

KNQSLFCWEIPVQIVSHL 


3187    A    3    470


SLSAMRFLAATFLLLALSTAAQAEPVQFKDCGSV 

DGVIKEVNVSPCPTQPCQLSKGQSYSVNVTFTSN 

IQSKSSKAWHGILMGVPVPFPIPEPDGCKSGINC 

PIQKDKTYSYLNKLPVKSEYPSIKLVVEWQLQDD 

KNQSLFCWEIPVQIVSHL 


3188    A    2    3483


PRVRTKLILLVNDKKRYERVGGGPKRLGRDVEM 

EEMIEQLQEKVHELEKQNDTLKNRLISAKQQLQT 

QGYRQTPYKNVQSRINTGRRKANENAGLQECPR 

KGIKFQDADVAETPHPMFTKYGNSLLEEARGEIR 

NLENVIQSQRGQIEELEHLAEILKTQLRRKENEIE 

LSLLQLREQQATDQRSNIRDNVEMIKLHKQLVE 

KSNALSAMEGKFIQLQEKQRTLKISHDALMANG 

DELNMQLKEQRLKCCSLEKQLHSMKFSERRIEEL 

QDRINDLEKERELLKENYDKLYDSAFSAAHEEQ 

WKLKEQQLKVQL\QLETALKSDLTDKTEILDRL 

KTERDQNEKLVQENRELQLQYLEQKQQLDELKK 

RIKLYNQENDINADELSEALLLIKAQKEQKNGDL 

SFLVKVDSEINKDLERSMRELQATHAETVQELEK 

TRNMLIMQHKmKDYQMEVEAVTRKMENLQQD 

YELKVEQYVHLLDIRAARIHKLEAQLKDUYGTK 

QYKFKPEIMPDDSVDEFDETIHLERGENLFEIHIN 

KVTFSSEVLQASGDKEPVTFCTYAFYDFELQTTP 

WRGLHPEYNFTSQYLVHVNDLFLQYIQKNTITL 

EVHQAYSTEYEHAACQLKFHEILEKSGRIFCTAS 

LIGTKGDIPNFGTVEYWFRLRVPMDQAIRLYRER 

AKALGYITSNFKGPEHMQSLSQQAPKTAQLSSTD 








STDGNLNELHITIRCCNHLQSRASHLQPHPYVVY 

KFFDFADHDTAIffSSNDPQFDDHMYFPVPMNM 

DLDRYLKSESLSFYVFDDSDTQENIYIGKVNVPLI 

SLAHDRCISGDFELTDHQKHPAGTIHVILKWKFA 

YLPPSGSITTEDLGNFIRSEEPEWQRLPPASSVST 

LVLAPRPKPRQRLTPVDKKVSFVDIMPHQSDVSQ 

EGSVDEVKENTEKMQQGKDDVSLLSEGQLAEQS 

LASSEDETEITEDLEPEVEEDMSASDSDDCIIPGPI 

SKMKQPSEKIRffillALSLlSIDSQVTMDDTIQRLFV 

ECRFYSLPAEETPVSLPKPKSGQWVYYNYSNVIY 

VDKENNKAKRDILKAILQKQEMPNRSLRFTVVS 

DPPEDEQDLECEDIGVAHVDLADMFQEGRDLIE 

QNIDVFDARADGEGIGKLRVTVEALHALQSVYK 

QYRDDLEA 


3189 


A 


476 


1175 


MKGSGWHLRSGMVGTLITTILPHWRRTAHVGTN 

ILTAVSYLKGLWMECVWHSTGIYQCQIYRSLLA 

LPQDLQAARALMGISCLLSGIACACAVIGMKCTR 

CAKGTPAKTTFAILGGTLFBLAGLLCMGAVSWTT 

NDVVQNFYNPLLPSGMKFEIGQALYLGFISSSLSL 

IGGTLLCLSCQDEAPYRPYQAPPRATTTTANTAP 

AYQPPAAYKDNRAPSVTSATHSGYRLNDYV 


3190 


A 


267 


1037 


DRMAWQGLVLAACLLMFPSTTADCLSRCSLCA 

VKTQDGPKPINPLICSLQCQAALLPSEEWERCQSF 

LSFFTPSTLGLNDKEDLGSKSVGEGPYSELAKLS 

GSFLKELEKSKFLPSISTKENTLSKSLEEKLRGLS 

DGFREGAESELMRDAQLNDGAMETGTLYLAEE 

DPKEQVKRYGGFLRKYPKRSSEVAGEGDGDSM 

GHEDLYKRYGGFLRRIRPKLKWDNQKRYGGFLR 

RQFKVVTRSQEDPNAYSGELFDA 


3191 


A 


29 


574 


GTSAGAQTKGALCQLKVPTEKLPSPLPTMADEID 

FTTGDAGASSTYPMQCSALRKNGFWLKGRPCK 

IVEMSTSKTGKHGHAKVHLVGIDIFTGKKYEDIC 

PSTHNMDVPNIKRNDYQLICIQDGYLSLLTETGE 

VREDLKLPEGELGKEIEGKYNAGEDVQVSVMCA 

MSEEYAVAIKPCK 


3192 


A 


105 


1661 


KVSADGMQSCESSGDSADDPLSRGLRRRGQPRV 

WIGAGLAGLAAAKALLEQGFTDVTVLEASSHIG 

GRVQSVKLGHATFELGATWIHGSHGNPIYHLTE 

ANGLLEETTDGERSVGRISLYSKNGVACYLTNH 

GRRIPKDWEEFSDLYNEVYNLTQEFFRHDKPVN 

AESQNSVGVFTREEVRNRIRNDPDDPEATKRLKL 

AMIQQYLKVESCESSSHSMDEVSLSAFGEWTEIP 

GAHHIIPSGFMRVVELLAEGIPAHVIQLGKPVRCI 

HWDQASARPRGPEIEPRGEGDHNHDTGEGGQGG 

EEPRGGRWDEDEQWSVWECEDCELIPADHVIV 

TVSLGVLKRQYTSFFRPGLPTEKVAAIHRLGIGTT 

DKIFLEFEEPFWGPECNSLQFVWEDEAESHTLTY 

PPELWYRKICGFDVLYPPERYGHVLSGWICGEEA 

LVMEKCDDEAVAEICTEMLRQFTGNPNIPKPRRJ 

LRSAWGSNPYFRGSYSYTQVGSSGADVEKLAKP 

LPYTESSKTATK 


3193 


A 


1 


1928 


QLGTRRCLRGDKVTNAMQDFLVTNLEPRFIEPQT 
ANLSWFKDSNSTTPLIFVLSPGTDPAADLYKFA 
EEMKFSKKLSAISLGQGQGPRAEAMMRSSIERGK 
WVFFQNCHLAPSWMPALERLIEHINPDKVHRDF 










RLWLTSLPSNKFPVSILQNGSKMTIEPPRGVRAN 

LLKSYSSLGEDFLNSCHKVMEFKSLLLSLCLFHG 

NALERRKFGPLGFNIPYEFTDGDLRICISQLKMFL 

DEYDDIPYKVLKYTAGEINYGGRVTDDWDRRCI 

MNILEDFYNPDVLSPEHSYSASGIYHQIPPTYDLH 

GYLSYIKSLPLNDMPEIFGLHDNANITFAQNETFA 

LLGTIIQLQPKSSSAGSQGREEIVEDVTQNILLKVP 

EPINLQWVMAKYPVLYEESMNTVLVQEVIRYNR 

LLQVITQTLQDLLKALKGLVVMSSQLELMAASL 

YNNTVPELWSAKAYPSLKPLSSWVMDLLQRLDF 

LQAWIQDGIPAVFWISGFFFPQAFLTGTLQNFAR 

KFVISIDTISFDFKVMFEAPSELTQRPQVGCYfflG 

LFLEGARWDPEAFQLAESQPKELYTEMAVrWLL 

PTPNRKAQDQDFYLCPIYKTLTRAGTLSTTGHST 

NYVIAVEIPTHQPQRHWIKRGVALICALDY 


3194 


A 


1 


1023 


DGWTPVHAAVDTGNVDSLKLLMYHRIPAHGNS 

FNEEESESSVFDLDGGEESPEGISKPVVPADLINH 

ANREGWTAAHIAASKGFKNCLEBLCRHGGLEPE 

RRDKCNRTVHDVATDDCKHLLENLNALKIPLRIS 

VGEIEPSNYGSDDLECENTICALNIRKQTSWDDFS 

KAVSQALTNHFQAISSDGWWSLEDVTGNNTTDS 

NIGLSARSIRSITLGNVPWSVGQSFAQSPWDFMR 

Khn<AEHITVLLSGPQEGCLSSVTYASMIPLQMM 

QNYLRLVEQYHNVIFHGPEGSLQDYIVHQLALCL 

KHRQMGWQDSPVEIVEELEVGCWFFPREQLLRT 

CSLVA 


3195 


A 


1 


1809 


MAASAQVSVTFEDVAVTFTQEEWGQLDAAQRT 

LYQEVMLETCGLLMSLGCPLFKPELIYQLDHRQE 

LWMATKDLSQSSYPGDNTKPKTTEPTFSHLALPE 

EVLLQEQLTQGASKNSQLGQSKDQDGPSEMQEV 

HLKIGIGPQRGKLLEKMSSERDGLGSDDGVCTKI 

TQKQVSTEGDLYECDSHGPVTDALIREEKNSYK 

CEECGKVFKKNALLVQHERIHTQVKPYECTECG 

KTFSKSTHLLQHLIIHTGEKPYKCMECGKAFNRR 

SHLTRHQRIHSGEKPYKCSECGKAFTHRSTFVLH 

HRSHTGEKPFVCKECGKAFRDRPGFIRHYIIHTGE 

KPYECIECIECGKAFNRRSYLTWHQQIHTGVKPF 

ECNECGKAFCESADLIQHYIIHTGEKPYKCMECG 

KAFNRRSHLKQHQRIHTGEKPYECSECGKAFTH 

CSTFVLHKRTHTGEKPYECKECGKAFSDRADLIR 

HFSIHTGEKPYECVECGKAFNRSSHLTRHQQIHT 

GEKPYECIQCGKAFCRSANLIRHSnHTGEKPYEC 

SECGKAFNRGSSLTHHQRIHTGRNPTIVTDVGRP 

FMTAQTSVNIQELLLGKEFLNITTEENLW 


3196 


A 


1400 


264 


VGFWERPLRSSRWFRHSLRRWEMLARAARGTG 

ALLLRGSLLASGRAPRRASSGLPRNTVVLFVPQQ 

EAWWERMGRFHRILEPGLNILIPVLDRIRYVQSL 

KEIVINVPEQSAVTLDNVTLQIDGVLYLRIMDPY 

KASYGVEDPEYAVTQLAQTTMRSELGKLSLDKV 

FRERESLNASIVDAINQAADCWGIRCLRYEIKDIH 

VPPRVKESMQMQVEAERRKRATVLESEGTRESA 

INVAEGKKQAQILASEAEKAEQINQAAGEASAVL 

AKAKAKAEAIRILAAALTQHNGDAAASLTVAEQ 

YVSAFSKLAKDSNTILLPSNPGDVTSMVAQAMG 

VYGALTKAPVPGTPDSLSSGSSRDVQGTDASLDE 










ELDRVKMS 


3197 


A 


66 


3632 


LWECAAAAAGQRDGGVTLFLKGRVLGRRCAAS 

LFAREVCVSTSSSRPACFLHCARARGEQMHQMA 

SGVGSMKRSPRKMWRPGEKKEPQGVVYEDVRD 

DTEDITCEPLKVVFEGSAYGLQNFNKQKKLKTCD 

DMDTFFLHYAAAEGQEELMEKITRDSSLEVLHE 

MDDYGNTPLHCAVEKNQIESVKFLLSRGANPNL 

RNFNMMAPLHIAVQGMNNEVMKVLLEHRTDDV 

ISnLEGENGNTAVIIACTTNNSEALQILLNKGAKJ'C 

KSNKWGCFPIHQAAFSGSKECMEIILRFGEEHGY 

SRQLHmFMlWGKATPLHLAVQNGDLEMIKMCL 

DNGAQIDPVEKGRCTAIHFAATQGATEIVKLMIS 

SYSGSVDIVNTTDGCHETMLHRASLFDHHELAD 

YLISVGADINKIDSEGRSPLILATASASWNIVNLL 

LSKGAQVDIKDNFGRNFLHLTVQQPYGLKNLRP 

EFMQMQQIKELVMDEDNDGCTPLHYACRQGGP 

GSVNNLLGFNVSIHSKSKDKKSPLHFAASYGRIN 

TCQRLLQDISDTRLLNEGDLHGMTPLHLAAKNG 

HDKWQLLLKKGALFLSDHNGWTALHHASMGG 

YTQTMKVILDTNLKCTDRLDEDGNTALHFAARE 

GHAKAVALLLSHNADIVLNKQQASFLHLALHNK 

RKEWLTIIRSKRWDECLKIFSHNSPGNKCPITEM 

lEYLPECMKVLLDFCMLHSTEDKSCRDYYIEYNF 

KYLQCPLEFTKKTPTQDVIYEPLTALNAMVQNN 

RIELLNHPVCKEYLLMKWLAYGFRAHMMNLGS 
YCLGLIPMTILVVNIKPGMAFKSTGIINETSDHSEI 
LDTTNSYLIKTCMILVFLSSIFGYCKEAGQIFQQK 
RNYFMDISNVLEWNYTTGIIFVLPLFVEIPAHLQ 

WQCGAIAVYFYWMNFLLYLQRFENCGIFIVMLE 

VILKTLLRSTVVFIFLLLAFGLSFYILLNLQDPFSS 

PLLSIIQTFSMMLGDINYRESFLEPYLRNELAHPV 

LSFAQLVSFTIFVPIVLMNLLIGLAVGDIAEVQKH 

ASLKRIAMQVELHTSLEKKLPLWFLRKVDQKSTI 

VYPNKPRSGGMLFHIFCFLFCTGEIRQEIPNADKS 

LEMEILKQKYRLKDLTFLLEKQHELIKLIIQKMEII 

SETEDDDSHCSFQDRFKKEQMEQRNSRWNTVLR 

AVKAKTHHLEP 


3198 


A 


51 


2177 


KEKSLHHVDQRPPLWHPGRPGTSQSAAMNASSE 

GESFAGSVQIPGGTTVLVELTPDIHICGICKQQFN 

NLDAFVAHKQSGCQLTGTSAAAPSTVQFVSEET 

WATQTQTTTRTITSETQTITVSAPEFVFEHGYQT 

YLPTESNENQTATXaSLPAKSRTKKPTTPPAQKRL 

NCCYPGCQFKTAYGMKDMERHLKIHTGDKPHK 

CEVCGKCFSRKDKLKTHMRCHTGVKPYKCKTC 

DYAAADSSSLNKHLRIHSDERPFKCQICPYASRN 

SSQLTVHLRSHTGDAPFQCWLCSAKFKISSDLKR 

HMRVHSGEKPFKCEFCNVRCTMKGNLKSHIRIK 

HSGNNFKCPHCAFLGDSKATLRKHSRVHQSEHR 

EKCSECSYSCSSKAALRIHERIHCTVRPFKCNYCS 

FDSKQPSNLSKHMKKFHGDMVKTEALERKDTG 

RQSSRQVAKLDAKKSFHCDICDASFMREDSLRS 

HKRQHSEYNESKNSDVTVLQFQIDPSKQPATPLT 

VGHLQVPLQPSQVPQFSEGRVKNVGHQVPQANT 

IVQAAAAAVNIVPPALVAQNPEELPGNSRLQILR 

QVSLIAPPQSSRCPSEAGAMTQPAVLLTTHEQTD 










GATLHQTLIPTASGGPQEGSGNQTFITSSGITCTD 

FEGLNALIQEGTAEVTWSDGGQNIAVATTAPPV 

FSSSSQQELPKQTYSnQGAAHPALLCPADSIPD 


3199 


A 


13 


2247 


QSFHSMEGDPSGLPLLARGASCYSLICPCPRPAD 
WSILQGTDWSILQSADWCIYNPLARHRALTGVFL 
QSADWCTYNPLARQKSSPSPHSTQEVQLASPLTR 
RPNKKDSAERNHRPAREGSVAQRQPNPAALEKA 
EPAARKRNEREGGGSQEPGREHSLEKGYWAPGL 
GPDPSMCSKQVDPSEGASSHLKHRGGSRAAHLE 
VRRLLRRLVGALVAEAGFCYVQVAEGQRWGV 
LEVAEAAAAPVQHEPTAAVATQSRWFPRGTRPG 
LCSLPIAVAALLCPGSGPGAQSGLEFVERPPPSPL 
AVVLARWPLPPPAGRCPRDAPEARVPEKARAEG 
SERENNYGCG WGGEMTTLVLDNGAYNAKIGY 
SHENVSVIPNCQFRSKTARLKTFTANQIDEIKDPS 
GLFYILPFQKGYLVNWDVQRQVWDYLFGKEMY 
QVDFLDTOniTEPYFNFTSIQESMNEILFEEYQFQ 
AVLRVNAGALSAHRYFRDNPSELCCIIVDSGYSF 
TfflVPYCRSKKKKEAIIRINVGGKLLTNHLKEIISY 
RQLHVMDETHVINQVKEDVCYVSQDFYRDMDI 
AKLKGEENTVMIDYVLPDFSTIKKGFCKPREEMV 
LSGKYKSGEQILRLANERFAVPEILFNPSDIGIQE 
MGIPEAIVYSIQNLPEEMQPHFFKNIVLTGGNSLF 
PGFRDRVYSEVRCLTPTDYDVSVVLPENPITYAW 
EGGKLISENDDFEDMWTREDYEENGHSVCEEK 
FDI 


3200 


A 


3 


307 


AVQRIRHEMNIFRLTGDLSHLAAIVILLLKIWKTR 

SCAGISGKSQLLFALVFTTRYLDLFTSFISLYNTS 

MKVWYAIHRNVFHLQCTGLWTLNLCQLCIFN 


3201 


A 


1 


469 


IRHEGRGQRGKMELVQVLKRGLQQITGHGGLRG 

YLRVFFRTNDAKVGTLVGEDKYGNKYYEDNKQ 

FFGRHRWVWTTEMNGKNTFWDVDGSMVPPE 

WHRWLHSMTDDPPTTKPLTARKFIWTNHKFNVT 

GTPEQYVPYSTTRKKIQEWIPPSTPYK 


3202 


A 


144 


840 


NSSQRIMATHALEIAGLFLGGVGMVGTVAVTVM 
PQWRVSAFIENNIVVFENFWEGLWMNCVRQANI 
RMQCKIYDSLLALSPDLQAARGLMCAASVMSFL 
AFMMAILGMKCTRCTGDNEKVKAHILLTAGIIFII 
TGMWLIPVSWVANAIIRDFYNSIVNVAQKRELG 
EALYLGWTTALVLIVGGALFCCVFCCNEKSSSYR 
YSIPSHRTTQKSYHTGKKSPSVYSRSQYV 


3203 


A 


2 


473 


KYRYRRPYPVMRKICQVGPAGLAFILNISPVAHR 
VALCHLAGCQEQAAWYHTLQILFFLVSAYFFSCP 
VPEKYFPGSCDIVGHGHQIFHAFLSICTLSQLEAIL 
LDYQGRQEIFLQRHGPLSVHMACLSFFFLAACSA 
ATAALLRHKVKARLTKKDS 


3204 


A 


1808 


668 


PESAPLPAFISSRILPAAWRNWCSYVVTRTISCHV 

QNGTYLQRVLQNCPWPMSCPGSSYRTVVRPTYK 

VMYKIVTAREWRCCPGHSRVSCEEVAGSSASLE 

PMWSGSTMRRMALRPTAFSGCLNCSKVSELTER 

LKVLEAKMTMLTVIEQPVPPTPATPEDPAPLWGP 

PPAQGSPGDGGLQDQVGAWGLPGPTGPKGDAG 

SRGPMGMRGPPGDPLLSNTFTETNNHWPQGPTG 

PPGPPGPMGPPGPPGPTGVPGSPGHIGPPGPTGPK 

GISGHPGEKGERGLRGEPGPQGSAGQRGEPGPKG 










DPGEKSHWGEGLHQLREALKILAERVLILETMIG 
LYEPELGSGAGPAGTGTTSLLRGKRGGHATNYRI 
VAPRSRDERG 


3205 


A 


2810 


1652 


RTSTQKWQSVFNDSQEHLERFYCWENDRMRM 

KYGGQEFWADLNAMNVYETTEFDQLRRLSTPPS 

SNVNSIYHTVWKFFCRDHFGWREYPESVIRLffiE 

ANSRGLKEVRFMMWNNHYILHNSFFRREK^ 

LFRSCFILLPYLQTLGGVPTQAPPPLEATSSSQIICP 

DGVTSANFYPETWVYMHPSQDFIQVPVSAEDKS 

YRIIYNLFHKTVPEFKYRILQILRVQNQFLWEKY 

KRKKEYMNRKMFGRDRIIbreRHLFHGTSQDVVD 

GICKHNFDPRVCGKHATMFGQGSYFAKKASYSH 

NFSKKSSKGVHFMFLAKVLTGRYTMGSHGMRR 

PPPVNPGSVTSDLYDSCVDNFFEPQIFVIFNDDQS 

YPYFVIQYEEVSNTVSI 


3206 


A 


297 


4500 


CLVDSKLWKGARSVYHQLFMSSLLMDLKYKKL 

FAVRFAKNYERLQSDYVTDDHDREFSVADLSVQ 

IFWPSLARMLITEENLMSniKTFMDHLRHRDAQ 

GRFQFERYTALQAFKFRRVQSLILDLKYVLISKPT 

EWSDELRQKFLEGFDAFLELLKCMQGMDPITRQ 

VGQHEEMEPEWEAAFTLQMKLTHVISMMQDWC 

ASDEKVLIEAYKKCLAVLMQCHGGYTDGEQPIT 

LSICGHSVEHRYCVSQEKVSIHLPVSRLLAGLHV 

LLSKSEVAYKFPELLPLSELSPPMLffiHPLRCLVL 

CAQVHAGMWRRNGFSLVNQIYYYHNVKCRRE 

MFDKDVVMLQTGVSMMDPNHFLMIMLSRFELY 

QIFSTPDYGKRFSSEITHKDWQQNNTLIEEMLYL 

IIMLVGERFSPGVGQVNATDEIKREIIHQLSIKPM 

AHSELVKSLPEDENKETGMESVffiAVAHFKKPGL 

TGRGMYELKPECAKEFNLYFYHFSRAEQSKAEE 

AQRKLKRQNREDTALPPPVLPPFCPLFASLVNILQ 

SDVMLCIMGTILQWAVEHNGYAWSESMLQRVL 

HLIGMALQEEKQHLENVTEEHVVTFTFTQKISKP 

GEAPKNSPSILAMLETLQNAPYLEVHKDMIRWIL 

KTFNAVKXMRESSPTSPVAETEGTIMEESSRDKD 

KAERKRKAEIARLRREKIMAQMSEMQRHFIDEN 

KELFQQTLELDASTSAVLDHSPVASDMTLTALGP 

AQTQVPEQRQFVTCILCQEEQEVKVESRAMVLA 

AFVQRSTVLSKNRSKFIQDPEKYDPLFMHPDLSC 

GTHTSSCGHIMHAHCWQRYFDSVQAKEQRRQQ 

RLRLHTSYDVENGEFLCPLCECLSNTVIPLLLPPR 

NIFNNRLNFSDQPNLTQWIRTISQQIKALQFLRKE 

ESTPNNASTKNSENVDELQLPEGFRPDFRPKIPYS 

ESDGEMLTTFGTATYKVGLKVHPNEEDPRVPIMC 

WGSCAYTIQSIERILSDEDKPLFGPLPCRLDDCLR 

SLTRFAAAHWTVASVSVVQGHFCKPFASLVPND 

SHEELPCELDIDMFHLLVGLVLAFPALQCQDFSGI 

SLGTGDLHIFHLVTMAHnQILLTSCTEENGMDQE 

NPPCEEESAVLALYKTLHQYTGSALKEIPSGWHL 

WRSVRAGIMPFLKCSALFFHYLNGVPSPPDIQVP 

GTSHFEHLCSYLSLPNNLICLFQENSEIMNSLIES 

WCRNSEVKRYLEGERDAIRYPRESNKLINLPEDY 

SSLINQASNFSCPKSGGDKSRAPTLCLVCGSLLCS 

QSYCiCQTELEGEDVGACTAHTYSCGSGVGIFLR 

VRECQVLFLAGKTKGCFYSPPYLDDYGETDQGL 






WO 01/57190 PCT/US01/04098 













RRGNPLHLCKERFKKIQKLWHQHSVTEEIGHAQ 
EANQTLVGIDWQHL 


3207 


A 


49 


963 


QLSPSQAPAGAQEVARRVTVGSASHGGRRSTMA 

TTVSTQRGPVYIGELPQDFLRITPTQQQRQVQLD 

AQAAQQLQYGGAVGTVGRLNITVVQAKLAKNY 

GMTRMDPYCRLRLGYAVYETPTAHNGAKNPRW 

NKVIHCTVPPGVDSFYLEIFDERAFSMDDRIAWT 

HITIPESLRQGKVEDKWYSLSGRQGDDKEGMINL 

VMSYALLPAAMVMPPQPVVLMPTVYQQGVGY 

VPITGMPAVCSPGMVPVALPPAAVNAQPRCSEE 

DLKAIQDMFPNMDQEVIRSVLEAQRGNKDAAIN 

SLLQMGEEP 


3208 


A 


54 


1196 


LERTPASADMAWTKYQLFLAGLMLVTGSINTLS 

AKWADNFMAEGCGGSKEHSFQHPFLQAVGMFL 

GEFSCLAAFYLLRCRAAGQSDSSVDPQQPFNPLL 

FLPPALCDMTGTSLMYVALNMTSASSFQMLRGA 

VIIFTGLFSVAFLGRRLVLSQWLGILATIAGLVW 

GLADLLSKHDSQHKLSEVITGDLLIIMAQUVAIQ 

MVLEEKFVYKHNVHPLRAVGTEGLFGFVILSLLL 

VPMYYIPAGSFSGNPRGTLEDALDAFCQVGQQP 

LIAVALLGNISSIAFFNFAGISVTKELSATTRMVL 

DSLRTVVIWALSLALGWEAFHALQILGFLILLIGT 

ALYNGLHRPLLGRLSRGRPLAEESEQERLLGGTR 

TPINDAS 


3209 


A 


104 


1999 


AKVVSLKEFSCFWRREKPVSSLSSLQVKAEASW 

DSAVHGCPQLSRGTPVDERLFLIVRVTVQLSHPA 

DMQLVLRKRICVNVHGRQGFAQSLLKKMSHRSS 

IPGCGVTFEIVSNIPEDAQGVEEREALARMAANV 

ENPASADSEAYIEKYLRSVLAVENLLTLDRLRQE 

VAVKEQLTGKGKLSRRSISSPNVNRLSGSRQDLIP 

SYSLGSNKGRWESQQDVSQTTVSRGIAPAPALSV 

SPQNNHSPDPGLSNLAASYLNPVKSFVPQMPKLL 

KSLFPVRDEKRGKRPSPLAHQPVPRIMVQSASPDI 

RVTRMEEAQPEMGPDVLVQTMGAPALKICDKP 

AKVPSPPPVIAVTAVTPAPEAQDGPPSPLSEASSG 

YFSHSVSTATLSDALGPGLDAAAPPGSMPTAPEA 

EPEAPISHPPPPTAVPAEEPPGPQQLVSPGRERPDL 

EAPAPGSPFRVRRVRASELRSFSRMLAGDPGCSP 

GAEGNAPAPGAGGQALASDSEEADEVPEWLREG 

EFVTVGAHKTGVVRYVGPADFQEGTWVGVELD 

LPSGKNDGSIGGKQYFRCNPGYGLLVRPSRVRR 

ATGPVRRRSTGLRLGAPEARRSATLSGSATNLAS 

LTAALAKADRSHKNPENRKSWAS 


3210 


A 


324 


694 


SPFWTEKRRMEKPLFPLVPLHWFGFGYTALWS 
GGIVGYVKTGSVPSLAAGLLFGSLAGLGAYQLY 
QDPRNVWGFLAATSVTFVGVMGMRSYYYGKF 
MPVGLIAGASLLMAAKVGVRMLMTSD 


3211 


A 


1078 


594 


VGMELPAVNLKVILLGHWLLTTWGCIVFSGSYA 
WANFTILALGVWAVAQRDSIDAISMFLGGLLATI 
FLDIVfflSIFYPRVSLTDTGRFGVGMAILSLLLKPL 
SCCFVYHMYRERGGELLVHTGFLGSSQDRSAYQ 
TIDSAEAPADPFAVPEGRSQDARGY 


3212 


A 


1 


1962 


FRCGLAPKGRPRRRADPVASAIMDPAEAVLQEK 
ALKFMMEFRSWCPGWNTMARSRLTATSTSRVQ 
CSMPRSLWLGCSSLADSMPSLRCLYNPGTGALT 










AFQNSSEREDCNNGEPPRKIIPEKNSLRQTYNSCA 

RLCLNQETVCLASTAMKTENCVAKTKLANGTSS 

MIVPKQRKLSASYEKEKELCVKYFEQWSESDQV 

EFVEHLISQMCHYQHGHINSYLKPMLQRDFITAL 

PARGLDHIAENILSYLDAKSLCAAELVCKEWYR 

VTSDGMLWKKLIERMVRTDSLWRGLAERRGWG 

QYLFKNKPPDGNAPPNSFYRALYPKIIQDIETIES 

NWRCGRHSLQRIHCRSETSKGVYCLQYDDQKIV 

SGLRDNTIKIWDKNTLECKRILTGHTGSVLCLQY 

DERVIITGSSDSTVRVWDVNTGEMLNTLIHHCEA 

VLHLRFNNGMMVTCSKDRSIAVWDMASPTDITL 

RRVLVGHRAAVNVVDFDDKYIVSASGDRTIKV 

WNTSTCEFVRTLNGHKRGIACLQYRDRLVVSGS 

SDNTIRLWDIECGACLRVLEGHEELVRCIRFDNK 

RIVSGAYDGKIKVWDLVAALDPRAPAGTLCLRT 

LVEHSGRVFRLQFDEFQIVSSSHDDTILIWDFLND 

PAAQSEPPRSPSRTYTYISR 


3213 


A 


1 


1962 


FRCGLAPKGRPRRRADPVASAIMDPAEAVLQEK 

ALKFMMEFRSWCPGWNTMARSRLTATSTSRVQ 

CSMPRSLWLGCSSLADSMPSLRCLYNPGTGALT 

AFQNSSEREDCNNGEPPRKIIPEKNSLRQTYNSCA 

RLCLNQETVCLASTAMKTENCVAKTKLANGTSS 

MIVPKQRKLSASYEKEKELCVKYFEQWSESDQV 

EFVEHLISQMCHYQHGHINSYLKPMLQRDFITAL 

PARGLDHIAENILSYLDAKSLCAAELVCKEWYR 

VTSDGMLWKKLIERMVRTDSLWRGLAERRGWG 

QYLFKNKPPDGNAPPNSFYRALYPKIIQDIETIES 

NWRCGRHSLQRIHCRSETSKGVYCLQYDDQKIV 

SGLRDNTIKIWDKNTLECKRILTGHTGSVLCLQY 

DERVIITGSSDSTVRVWDVNTGEMLNTLIHHCEA 

VLHLRFNNGMMVTCSKDRSIAVWDMASPTDITL 

RRVLVGHRAAVNVVDFDDKYIVSASGDRTIKV 

WNTSTCEFVRTLNGHKRGIACLQYRDRLVVSGS 

SDNTIRLWDIECGACLRVLEGHEELVRCIRFDNK 

RIVSGAYDGKIKVWDLVAALDPRAPAGTLCLRT 

LVEHSGRVFRLQFDEFQIVSSSHDDTILIWDFLND 

PAAQSEPPRSPSRTYTYISR 


3214 


A 


1 


1962 


FRCGLAPKGRPRRRADPVASAIMDPAEAVLQEK 

ALKFMMEFRSWCPGWNTMARSRLTATSTSRVQ 

CSMPRSLWLGCSSLADSMPSLRCLYNPGTGALT 

AFQNSSEREDCNNGEPPRKIIPEKNSLRQTYNSCA 

RLCLNQETVCLASTAMKTENCVAKTKLANGTSS 

MIVPKQRKLSASYEKEKELCVKYFEQWSESDQV 

EFVEHLISQMCHYQHGHINSYLKPMLQRDFITAL 

PARGLDHIAENILSYLDAKSLCAAELVCKEWYR 

VTSDGMLWKKLIERMVRTDSLWRGLAERRGWG 

QYLFKNKPPDGNAPPNSFYRALYPKIIQDIETIES 

NWRCGRHSLQRIHCRSETSKGVYCLQYDDQKIV 

SGLRDNTIKIWDKNTLECKRILTGHTGSVLCLQY 

DERVIITGSSDSTVRVWDVNTGEMLNTLIHHCEA 

VLHLRFNNGMMVTCSKDRSIAVWDMASPTDITL 

RRVLVGHRAAVNVVDFDDKYIVSASGDRTIKV 

WNTSTCEFVRTLNGHKRGIACLQYRDRLVVSGS 

SDNTIRLWDIECGACLRVLEGHEELVRCIRFDNK 

RIVSGAYDGKIKVWDLVAALDPRAPAGTLCLRT 










LVEHSGRVFRLQFDEFQIVSSSHDDTILIWDFLND 
PAAQSEPPRSPSRTYTYISR 


3215 


A 


2 


1376 


EARLVGCQRGGPARPGSYSSGAETAGRAMAAN 

LSRNGPALQEAYVRWTEKSPTDWALFTYEGNS 

NDIRVAGTGEGGLEEMVEELNSGKVMYAFCRV 

KDPNSGLPKFVLINWTGEGVNDVRKGACASHVS 

TMASFLKGAHVTINARAEEDVEPECIMEKVAKA 

SGANYSFHKESGRFQDVGPQAPVGSVYQKTNAV 

SEIKRVGKDSFWAKAEKEEENRRLEEKRRAEEA 

QRQLEQERRERELREAARREQRYQEQGGEASPQ 

RTWEQQQEWSRNRNEQESAVHPREIFKQKERA 

MSTTSISSPQPGKLRSPFLQKQLTQPETHFGREPA 

AAISRPRADLPAEEPAPSTPPCLVQAEEEAVYEEP 

PEQETFYEQPPLVQQQGAGSEHIDHHIQGQGLSG 

QGLCARALYDYQAADDTEISFDPENLITGIEVIDE 

GWWRGYGPDGHFGMFPANYVELIE 


3216 


A 


936 


204 


AMASTLEYSPSPLRRLVGPAAGFSRAARADLSW 

DPMAFFTGLWGPFTCVSRVLSHHCFSTTGSLSAI 

QKMTRVRWDNSALGNSPYHRAPRCIHVYKKN 

GVGKVGDQILLAIKGQKKKALIVGHCMPGPRMT 

PRFDSNNVVLIEDNGNPVGTRIKTPIPTSLRKREG 

EYSKVLAIAQNFV 


3217 


A 


1 


1563 


MLCALLLLPSLLGATRASPTSGPQECAKGSTVW 

CQDLQTAARCGAVGYCQGAVWNKPTAKSLPCD 

VCQDIAAAAGNGLNPDATESDILALVMKTCEWL 

PSQESSAGCKWMVDAHSSAILSMLRGAPDSAPA 

QVCTALSLCEPLQRHLATLRPLSKEDTFEAVAPF 

MANGPLTFHPRQAPEGALCQDCVRQVSRLQEAV 

RSNLTLADLNIQEQCESLGPGLAVLCKNYLFQFF 

VPADQALRLLPPQELCRKGGFCEELGAPARLTQ 

VVAMDGVPSLELGLPRKQSEMQMKAGVTCEVC 

MNVVQKLDHWLMSNSSELMITHALERVCSVMP 

ASITKECIILVDTYSPSLVQLVAKITPEKVCKFIRL 

CGNRRRARAVHDAYAIVPSPEWDAENQGSFCNG 

CKRLLTVSSHNLESKSTKRDILVAFKGGCSILPLP 

YMIQCKHFVTQYEPVLIESLKDMMDPVAVCKKV 

GACHGPRTPLLGTDQCALGPSFWCRSQEAAKLC 

NAVQHCQKHVWKEMHLHAGEHA 


3218 


A 


1 


1563 


MLCALLLLPSLLGATRASPTSGPQECAKGSTVW 

CQDLQTAARCGAVGYCQGAVWNKPTAKSLPCD 

VCQDIAAAAGNGLNPDATESDILALVMKTCEWL 

PSQESSAGCKWMVDAHSSAILSMLRGAPDSAPA 

QVCTALSLCEPLQRHLATLRPLSKEDTFEAVAPF 

MANGPLTFHPRQAPEGALCQDCVRQVSRLQEAV 

RSNLTLADLNIQEQCESLGPGLAVLCKNYLFQFF 

VPADQALRLLPPQELCRKGGFCEELGAPARLTQ 

VVAMDGVPSLELGLPRKQSEMQMKAGVTCEVC 

MNVVQKLDHWLMSNSSELMITHALERVCSVMP 

ASITKECIILVDTYSPSLVQLVAKITPEKVCKFIRL 

CGNRRRARAVHDAYAIVPSPEWDAENQGSFCNG 

CKRLLTVSSHNLESKSTKRDILVAFKGGCSILPLP 

YMIQCKHFVTQYEPVLIESLKDMMDPVAVCKKV 

GACHGPRTPLLGTDQCALGPSFWCRSQEAAKLC 

NAVQHCQKHVWKEMHLHAGEHA 


3219 


A 


1623 


572 


TSAEGWKGCTCTFKDRSKLREHLRSHTQEKVVA 










CPTCGGMFANNTKFLDHIRRQTSLDQQHFQCSH 

CSKRFATERLLIU^HMRNHVNHYKCPLCDMTCPL 

PSSLRNHMRFRHSEDRPFKCDCCDYSCKNLIDLQ 

KHLDTHSEEPAYRCDFENCTFSARSLCSIKSHYR 

KVHEGDSEPRYKCHVCDKCFTRGNNLTVHLRK 

KHQFKWPSGHPRFRYKEHEDGYMRLQLVRYES 

VELTQQLLRQPQEGSGLGTSLNESSLQGIILETVP 

GEPGRKEEEEEGKGSEGTALSASQDNPSSVIHW 

NQTNAQGQQEIVYYVLSEAPGEPPPVPEPPSGGI 

MEKLQGIAEEPEIQMV 


3220 


A 


2760 


745 


SLGIPSGNTRGTGLVLDGDTSYTYHLVCMGPEAS 

GWGQDEPQTWPTDHRAQQGVQRQGVSYSVHA 

YTGQPSPRGLHSENREDEGWQVYRLGARDAHQ 

GRPTWALRPEDGEDKEMKTYRLDAGDADPRRL 

CDLERERWAVIQGQAVRKSSTVATLQGTPDHGD 

PRTPGPPRSrPLEENWDREQIDFLAARQQFLSLE 

QANKGAPHSSPARGTPAGTTPGASQAPKAFNKP 

HLANGHVVPDCPQVKGVVREENKVRAVPTWAS 

VQVVDDPGSLASVESPGTPKETPIEREIRLAQERE 

ADLREQRGLRQATDHQELVEIPTRPLLTKLSLITA 

PRRERGRPSLYVQRDIVQETQREEDHRREGLHV 

GRASTPDWVSEGPQPGLRRALSSDSILSPAPDAR 

AADPAPEVRKVNRIPPDAYQPYLSPGTPQLEFSA 

FGAFGKPSSLSTAEAKAATSPKATMSPRHLSESS 

GKPLSTKQEASKPPRGCPQANRGVVRWEYFRLR 

PLRFRAPDEPQQAQVPHVWGWEVAGAPALRLQ 

KSQSSDLLERERESVLRREQEVAEERRNALFPEV 

FSPTPDENSDQNSRSSSQASGITGSYSVSESPFFSPI 

HLHSNVAWTVEDPVDSAPPGQRKKEQWYAGIN 

PSDGINSEVLEAIRVTRHKNAMAERWESRIYASE 

EDD 


3221 


A 


15 


478 


SRVFFFFFFFPAFKMSKRGRGGSSGAKFRISLGLP 
VGAVINCADNTGAKNLYIISVKGIKGRLNRLPAA 
GVGDMVMATVKKGKPELRKKVHPAVVIRQRKS 
YRRKDGVFLYFEDNAGVIVNNKGEMKGSAITGP 
VAKECADLWPRIASNAGSIA 


3222 


A 


207 


1321 


PLIPLHPANRSPATMAELQEVQITEEKPLLPGQTP 

EAAKTHSVETPYGSVTFTVYGTPKPKRPAILTYH 

DVGLNYKSCFQPLFQFEDMQEnQNFVRVHVDAP 

GMEEGAPVFPLGYQYPSLDQLADMIPCVLQYLN 

FSTIIGVGVGAGAYILARYALNHPDTVEGLVLINI 

DPNAKGWMDWAAHKLTGLTSSIPEMILGHLFSQ 

EELSGNSELIQKYRNIITHAPNLDNIELYWNSYNN 

RRDLNFERGGDITLRCPVMLVVGDQAPHEDAVV 

ECNSKLDPTQTSFLKMADSGGQPQLTQPGKLTE 

AFKYFLQGMGYMASSCMTRLSRSRTASLTSAAS 

VDGNRSRSRTLSQSSESGTLSSGPPGHTMEVSC 


3223 


A 


132 


1664 


SARRWGAAGAGPHGLHLRAHGPRPSVRTGLPSV 

GRQAAGAAMGRGWGFLFGLLGAVWLLSSGHGE 

EQPPETAAQRCFCQVSGYLDDCTCDVETIDRFNN 

YRLFPRLQKLLESDYFRYYKVNLKRPCPFWNDIS 

QCGRRDCAVKPCQSDEVPDGIKSASYKYSEEAN 

NLIEECEQAERLGAVDESLSEETQKAVLQWTKH 

DDSSDNFCEADDIQSPEAEYVDLLLNPERYTGYK 

GPDAWKIWNVIYEENCFKPQTIKRPLNPLASGQG 










TSEENTFYSWLEGLCVEKRAFYRLISGLHASINV 

HLSARYLLQETWLEKKWGHNITEFQQRFDGILTE 

GEGPRRLKNLYFLYLIELRALSKVLPFFERPDFQL 

FTGNKIQDEENKMLLLEILHEIKSFPLHFDENSFF 

AGDKKEAHKLKEDFRLHFRNISRIMDCVGCFKC 

RLWGKLQTQGLGTALKILFSEKLIANMPESGPSY 

EFHLTRQEIVSLFNAFGRISYKCERIRKTSRNLLQ 

NIH 


3224 


A 


2 


803 


PGSTISWDRDAAGESGTRAASPSPSGSRTAGRLP 

SPSYSPLPAPSLFPPPPLPAPAASTMSAGGDFGNP 

LRKFKLVFLGEQSVGKTSLITRFMYDSFDNTYQA 

TIGIDFLSKTMYLEDRTVRLQLWDTAGQERFRSL 

IPSYIRDSTVAVVVYDITNLNSFQQTSKWIDDVRT 

ERGSDVIIN4LVGNKTDLADKRQITIEEGEQRAKE 

LSVMFIETSAKTGYNVKQLFRRVASALPGMENV 

QEKSKEGMroiKLDKPQEPPASEGGCSC 


3225 


A 


3 


5054 


PEVTKPSLSQPTAASPIGSSPSPPVNGGNNAKRVA 

VPNGQPPSAARYMPREVPPRFRCQQDHKVLLKR 

GQPPPPSCMLLGGGAGPPPCTAPGANPNNAQVT 

GALLQSESGTAPDSTLGGAAASNYANSTWGSGA 

SSNNGTSPNPIfflWDKVIVDGSDMEEWPCIASKD 

TESSSENTTDNNSASNPGSEKSTLPGSTTSNKGK 

GSQCQSASSGNECNLGVWKSDPKAKSVQSSNST 

TENNNGLGNWRNVSGQDRIGPGSGFSNFNPNSN 

PSAWPALVQEGTSRKGALETDNSNSSAQVSTVG 

QTSREQQSKMENAGVNFWSGREQAQIHNTDGP 

KNGNTNSLNLSSPNPMENKGMPFGMGLGNTSRS 

TDAPSQSTGDRKTGSVGSWGAARGPSGTDTVSG 

QSNSGNNGNNGBCEREDSWKGASVQKSTGSKND 

SWDNNNRSTGGSWNFGPQDSNDNKWGEGNKM 

TSGVSQGEWKQPTGSDELKIGEWSGPNQPNSST 

GAWDNQKGHPLLENQGNAQAPCWGRSSSSTGS 

EVEGQSTGSNHKAGSSDSHNSGRRSYRPTHPDC 

QAVLQTLLSRTDLDPRVLSNTGWGQTQIKQDTV 

WDIEEVPRPEGKSDKGTEGWESAATQTKNSGG 

WGDAPSQSNQMKSGWGELSASTEWKDPKNTGG 

WNDYKNNNSSNWGGGRPDEKTPSSWNENPSKD 

QGWGGGRQPNQGWSSGKNGWGEEVDQTKNSN 

WESSASKPVSGWGEGGQNEIGTWGNGGNASLA 

SKGGWEDCKRSPAWNETGRQPNSWNKQHQQQ 

QPPQQPPPPQPEASGSWGGPPPPPPGNVRPSNSS 

WSSGPQPATPKDEEPSGWEEPSPQSISRKMDBDD 

GTSAWGDPNSYNYKNVNLWDKNSQGGPAPREP 

NLPTPMTSKSASDSKSMQDGWGESDGPVTGARH 

PSWEEEEDGGVWNTTGSQGSASSHNSASWGQG 

GKKQMKCSLKGGNNDSWMNPLAKQFSNMGLL 

SQTEDNPSSKMDLSVGSLSDKKFDVDKRAMNLG 

DFNDIMRKDRSGFRPPNSKDMGTTDSGPYFEKG 

GSHGLFGNSTAQSRGLHTPVQPLNSSPSLRAQVP 

PQFISPQVSASMLKQFPNSGLSPGLFNVGPQLSPQ 

QIAMLSQLPQIPQFQLACQLLLQQQQQQQLLQN 

QRKISQAVRQQQEQQLARMVSALQQQQQQQQR 

QPGMKHSPSHPVGPKPHLDNMVPNALNVGLPDL 

QTKGPIPGYGSGFSSGGMDYGMVGGKEAGTESR 

FKQWTSMMEGLPSVATQEANMHKNGAIVAPGK 










TRGGSPYNQFDIIPGDTLGGHTGPAGDSWLPAKS 

PPTNKIGSKSSNASWPPEFQPGVPWKGIQNIDPES 

DPYVTPGSVLGGTATSPIVDTDHQLLWDNTTGSN 

SSLNTSLPSPGAWPYSASDNSFTNVHSTSAKFPD 

YKSTWSPDPIGHNPTHLSNKMWKNmSSRNTTPL 

PRPPPGLTNPKPSSPWSSTAPRSVRGWGTQDSRL 

ASASTWSDGGSVRPSYWLVLHNLTPQIDGSTLRT 

ICMQHGPLLTFHLNLTQGTALIRYSTKQEAAKAQ 

TALHMCVLGNTTILAEFATDDEVSRFLAQAQPPT 

PAATPSAPAAGWQSLETGQNQSDPVGPALNLFG 

GSTGLGQWSSSAGGSSGADLAGASLWGPPNYSS 

SLWGVPTVEDPHRMGSPAPLLPGDLLGGGSDSI 


3226 


A 


200 


1387 


VPWKRQDEQLSLQVETLYLDSPAVIHLLSPTFLP 

PSSLPPFLQIVDSSSSACTLDSFFPFLAPWDSPQDC 

GFKDHQPLTLQALTVELARWTLMLLLSTAMYG 

AHAPLLALCHVDGRVPFRPSSAVLLTELTKLLLC 

AFSLLVGWQAWPQGPPPWRQAAPFALSALLYG 

ANNNLVIYLQRYMDPSTYQVLSNLKIGSTAVLY 

CLCLRHRLSVRQGLALLLLMAAGACYAAGGLQ 

VPGNTLPSPPPAAAASPMPLHITPLGLLLLILYCLI 

SGLSSVYTELLMKRQRLPLALQNLFLYTFGVLLN 

LGLHAGGGSGPGLLEGFSGWAALVVLSQALNGL 

LMSAVMKHGSSITRLFVVSCSLVVNAVLSAVLL 

RLQLTAAFFLATLLIGLAMRLYYGSR 


3227 


A 


1 


679 


RSTRARTRRPGLRAVPLPVGGFLGKMKWVWAL 

LLLAALGSGRAERDCRVSSFRVKENFDKARFSGT 

WYAMAKKDPEGLFLQDNIVAEFSVDETGQMSA 

TAKGRVRLLNNWDVCADMVGTFTDTEDPAKFK 

MKYWGVASFLQKGNDDHWIVDTDYDTYAVQY 

SCRLLNLDGTCADSYSFVFSRDPNGLPPEAQKIV 

RQRQEELCLARQYRLIVHNGYCDGRSERNLL 


3228 


A 


430 


1104 


QQESPAAGAARMNCKEGTDSSCGCRGNDEKKM 

LKCVVVGDGAVGKTCLLMSYANDAFPEEYVPT 

VFDHYAVTVTVGGKQHLLGLYDTAGQEDYNQL 

RPLSYPNTDVFLICFSVVNPASYHNVQEEWVPEL 

KDCMPHVPYVLIGTQIDLRDDPKTLARLLYMKE 

KPLTYEHGVKLAKAIGAQCYLECSALTQKGLKA 

VFDEAILTIFHPKKKKKRCSEGHSCCSn 


3229 


A 


25 


722 


AISAGRSAKMQLKPMEINPEMLNKVLSRLGVAG 

QWRFVDVLGLEEESLGSVPAPACALLLLFPLTAQ 

HENFRKKQffiELKGQEVSPKVYFMKQTIGNSCGT 

IGLIHAVANNQDKLGFEDGSVLKQFLSETEKMSP 

EDRAKCFEKNEAIQAAHDAVAQEGQCRVDDKV 

NFHFILFNNVDGHLYELDGRMPFPVNHGASSEDT 

LLKDAAKVCREFTEREQGEVRFSAVALCKAA 


3230 


A 


282 


1479 


GDAATTACAPPDWFLGPRKLAAGPAGGGMLPR 

RLLAAWLAGTRGGGLLALLANQCRFVTGLRVR 

RAQQIAQLYGRLYSESSRRVLLGRLWRRLHGRP 

GHASALMAALAGVFVWDEERIQEEELQRSINEM 

KRLEEMSNMFQSSGVQHHPPEPKAQTEGNEDSE 

GKEQRWEMVMDKKHFKLWRRPITGTHLYQYRV 

FGTYTDVTPRQFFNVQLDTEYRKKWDALVIKLE 

VmRDVVSGSEVLHWVTHFPYPMYSRDYVYVRR 

YSVDQENNMMVLVSRAVEHPSVPESPEFVRVRS 

YESQMVIRPHKSFDENGFDYLLTYSDNPQTVFPR 










YCVSWMVSSGMPDFLEKLHMATLKAKNNIEIKV 
KDYISAKPLEMSSEAKATSQSSERK>JEGSCGPAR 
DBYA 


3231 


A 


2117 


590 


FVPEPPEAGASSPCAPGDPDMSFRKWRQSKFRH 

WGQPVKNDQCYEDIRVSRVTWDSTFCAVNPKF 

LAVIVEASGGGAFLVLPLSKTGRIDKAYPTVCGH 

TGPVLDIDWCPHNDEVIASGSEDCTVMVWQIPE 

NGLTSPLTEPVWLEGHTKRVGIIAWHPTARNVL 

LSAGCDNWLIWNVGTAEELYRLDSLHPDLIYN 

VSWNHNGSLFCSACKDKSVRIIDPRRGTLVAERE 

KAHEGARPMRAIFLADGKVFTTGFSRMSERQLA 

LWDPENLEEPMALQELDSSNGALLPFYDPDTSV 

VYVCGKGDSSIRYFEITEEPPYIHFLNTFTSKEPQR 

GMGSMPKRGLEVSKCEIARFYKLHERKCEPIVM 

TVPRKSDLFQDDLYPDTAGPEAALEAEEWVSGR 

DADPILISLREAYVPSKQRDLKISRRNVLSDSRPA 

MAPGSSHLGAPASTTTAADATPSGSLARAGEAG 

KLEEVMQELRALRALVKEQGDRICRLEEQLGRM 

ENGDA 


3232 


A 


3 


718 


RLREDDRRGLPLSSPLWTEPPLSCCLPATYPADM 
GTAGAMQLCWVILGFLLFRGHNSQPTMTQTSSS 
QGGLGGLSLTTEPVSSNPGYIPSSEANRPSHLSST 
GTPGAGVPSSGRDGGTSRDTFQTVPPNSTTMSLS 
MREDATDLPSPTSETVLTVAAFGVISFIVILVVVVI 
ILVGVVSLRFKCRKSKESEDPQKPGSSGLSESCST 
ANGEKDSITLISMKNINMNNGKQSLSAEKVL 


3233 


A 


3 


718 


RLREDDRRGLPLSSPLWTEPPLSCCLPATYPADM 
GTAGAMQLCWVILGFLLFRGHNSQPTMTQTSSS 
QGGLGGLSLTTEPVSSNPGYIPSSEANRPSHLSST 
GTPGAGVPSSGRDGGTSRDTFQTVPPNSTTMSLS 
MREDATILPSPTSETVLTVAAFGVISFIVILVWVI 
ILVGWSLRFKCRKSKESEDPQKPGSSGLSESCST 
ANGEKDSITLISMKNINMNNGKQSLSAEKVL 


3234 


A 


1169 


4292 


AGDCGRLGVGGSEFPWEGSALGASPLPPICLQSR 

TWLLRAPAPAELGELEEVAAGRGDVWEPFLDSP 

GREESLQEASPRLADHGSSSGGGWEVKRSQRLR 

RGPSSPRRPYQDMEYERRGGRGDRTGRYGATDR 

SQDDGGENRSRDHDYRDMDYRSYPREYGSQEG 

KHDYDDSSEEQSAEDSYEASPGSETQRRRRRRH 

RHSPTGPPGFPRDGDYRDQDYRTEQGEEEEEEED 

EEEEEKASNIVMLRMLPQAATEDDIRGQLQSHG 

VQAREVRLMRNKSSGQSRGFAFVEFSHLQDATR 

WI^ANQHSLNILGQKVSMHYSDPICPKINEDWL 

CNKCGVQNFKRREKCFKCGVPKSEAEQKLPLGT 

RLDQQTLPLGGRELSQGLLPLPQPYQAQGVLAS 

QALSQGSEPSSENANDTIILRNLNPHSTMDSILGA 

LAPYAVLSSSNVRVIKDKQTQLNRGFAFIQLSTIE 

AAQLLQILQALHPPLTIDGKTINVEFAKGSKRDM 

ASNEGSRISAASVASTAIAAAQWAISQASQGGEG 

TWATSEEPPVDYSYYQQDEGYGNSQGTESSLYA 

HGYLKGTKGPGITGTKGDPTGAGPEASLEPGADS 

VSMQAFSRPQPGAAPGF^QQSAEASSSQGTAANS 

QSYTIMSPAVLKSELQSPTHPSSALPPATSPTAQE 

SYSQYPVPDVSTYQYDETSGYYYDPQTGLYYDP 

NSQYYYNAQSQQYLYWDGERRTYVPALEQSAD

GHKETGAPSKEGKEKKEKHKTKTAQQIAKDME 

RWARSLNKQKENFKNSFQPISSLRDDERRESATA 

DAGYAILEKKGALAERQHTSMDLPKLASDDRPS 

PPRGLVAAYSGESDSEEEQERGGPEREEKLTDW 

QKLACLLCRRQFPSKEALIRHQQLSGLHKQNLEI 

HRRAHLSENELEALEKNDMEQMKYRDRAAERR 

EKYGIPEPPEPKRRKYGGISTASVDFEQPTRDGLG 

SDNIGSRMLQAMGWKEGSGLGRKKQGIVTPIEA 

QTRVRGSGLGARGSSYGVTSTESYKETLHKTMV 

TRFNEAQ 


3235 


A 


3 


1217 


PSFLNTGLGPTALGVLGGAGAGLMSNPSPQVPEE 

EASTSVCRPKSSMASTSRRQRRERRFRRYLSAGR 

LVRAQALLQRHPGLDVDAGQPPPLHRACARHD 

APALCLLLRLGADPAHQDRHGDTALHAAARQG 

PDAYTDFFLPLLSRCPSAMGIKNKDGETPGQILG 

WGPPWDSAEEEEEDDASKEREWRQKLQGELED 

EWQEVMGRFEGDASHETQEPESFSAWSDRLARE 

HAQKCQQQQREAEGSCRPPRAEGSSQSWRQQEE 

EQRLFRERARAKEEELRESRARRAQEALGDREP 

KPTRAGPREEHPRGAGRGSLWRFGDVPWPCPGG 

GDPEAMAAALVARGPPLEEQGALRRYLRVQQV 

RWHPDRFLQRFRSQIETWELGRVMGAVTALSQA 

LNRHAEALK 


3236 


A 


3 


1416 


GPASGMAEPTSDFETPIGWHASPELTPTLGPLSDT 

APPRDRWMFWAMLPPPPPPLTSSLPAAGSKPSSE 

SQPPMEAQSLPGAPPPFDAQBLPGAQPPFDAQSPL 

DSQPQPSGQPWNFHASTSWYWRQSSDRFPRHQK 

SLNPAVKNSYYPRKYDAKFTDFSLPPSRKQKKK 

KRKEPVFHFFCDTCDRGFKNQEKYDKHMSEHTK 

CPELDCSFTAHEKWQFHWRNMHAPGMKKIKLD 

TPEEIARWREERRKNYPTLANIERKKKLKLEKEK 

RGAVLTTTQYGKMKGMSRHSQMAKIRSPGKNH 

KWKIsrDNSRQRAVTGSGSHLCDLKLEGPPEANA 

DPLGVLINSDSESDKEEKPQHSVIPKEVTPALCSL 

MSSYGSLSGSESEPEETPIKTEADVLAENQVLDSS 

APKSPSQDVKATVRNFSEAKSENRKKSFEKTNPK 

REKRLSQLSNVIRTKNTPSISLGNASSSGHST 


3237 


A 


3806 


2204 


FVGEQEGGCEAGAGRGAQTYPGEAGERWFGRR 

RRRGRVVSRKKMSLKSERRGIHVDQSDLLCKKG 

CGYYGNPAWQGFCSKCWREEYHKARQKQIQED 

WELAERLQREEEEAFASSQSSQGAQSLTFSKFEE 

KKTNEKTRKVTTVKKFFSASSRVGSKKEIQEAKA 

PSPSINRQTSIETDRVSKEFIEFLKTFHKTGQEIYK 

QTKLFLEGMHYKRDLSffiEQSECAQDFYHNVAE 

RMQTRGKVPPERVEKIMDQIEKYIMTRLYKYVF 

CPETTDDEKKDLAIQmRALRWVTPQMLCVPV 

NEDIPEVSDlvrm:AITDIIEMDSKRVPRDKLACIT 

KCSKHIFNAIKITKNEPASADDFLPTLIYIVLKGNP 

PRLQSNIQYITRFCNPSRLMTGEDGYYFTNLCCA 

VAFIEKLDAQSLNLSQEDFDRYMSGQTSPRKQEA 

ESWSPDACLGVKQMYKNLDLLSQLNERQERIMN 

EAKKLEKDLIDWTDGIAREVQDIVEKYPLEIKPP 

NQPLAAIDSENVENDKLPPPLQPQVYAG 


3238 


A 


1373 


449 


VLSVCPTGWRPAPCRMAFMKKYLLPILGLFMA 
YYYYSANEEFRPEMLQGKKVIVTGASKGIGREM
AYHLAKMGAHVVVTARSKETLQKVVSHCLELG 
AASAHYIAGTMEDMTFAEQFVAQAGKLMGGLD 
NCIL>miTNTSL>nJTIDDIHHVRKSMEVNFLSYV 
VLTVAALPMLKQSNGSIVWSSLAGKVAYPMVA 
AYSASKFALDGFFSSIRKEYSVSRVNVSITLCVLG 
LIDTETAMKAVSGIVHMQAAPKEECALEnKGGA 
LRQEEVYYDSSLWTTLLIKNPCRKILEFLYSTSYN 
MDRFINK 


3239 


A 


213 


422 


ERTMQLEIKVALNFI1FYLYNKLLW/QPLKKK*EA 
HWYPDKPLKGSGFHT/GEMVDPVGELAAKRSGL 
TVED 


3240 


A 


1255 


1425 


HESYHVNPNLCNPVAPTSGAHSIG*KWPSWLGA 
VAHSCNPSTLVGRGGRITRGQELR 


3241 


A 


161 


547 


PAGIGRSTAKTPGTPGSLEMENLKSGVYPLKEAS 
GCPGADRNLLVYSFYEKGPLTFRDVAIEFSLEEW 
QCLDTAQQDLYRKVMLENYRNLVFLAGIAVSKP 

DLITCLEQGKEPWNMKRHAMVDQPPGR 


3242 


A 


50 


241 


PLPARGKSTLPATFCSPSAPELASMSVVPPNRSQT 
GWPRGVTQFGNKYIQQTKPLTLERTINL 


3243 


A 


380 


702 


FVAYLKLPFFSQVCLFASSEMFFTISRKNMSQKLS 
LLLLVFGLIWGLMLLHYTFQQPRHQSSVKLREQI 
LDLSKRYVKALAEENKNTVDVENGASMAGYGK 
ITVEYF 


3244 


A 


37 


1391


VLMDGRMMRSMRLREEESPGPSHTASCLCGSAP 

CILCSCCPASRNSTVSRLIFTFFLFLGVLVSIIMLSP 

GVESQLYKLPWVCEEGAGIPTVLQGfflDCGSLLG 

YRAVYRMCFATAAFFFFFTLLMLCVSSSRDPRA 

AIQNGFWFFKFLILVGLTVGAFYIPDGSFTNIWFY 

FGVVGSFLFILIQLVLLIDFAHSWNQRWLGKAEE 

CDSRAWYAGLFFFTLLFYLLSIAAVALMFMYYT 

EPSGCHEGKVFISLNLTFCVCVSIAAVLPKVQDA 

QPNSGLLQASVITLYTMFVTWSALSSIPEQKCNP 

HLPTQLGNETVVAGPEGYETQWWDAPSrVGLIIF 

LLCTLFISLRSSDHRQVNSLMQTEECPPMLDATQ 

QQQQVAACEGRAFDNEQDGVTYSYSFFHFCLVL 

ASLHVMMTLTNWYKPGETRKMISTWTAVWVKI 

CASWAGLLLYL 


3245 


A 


52 


426 


SSLGNEDDEILSLAKDITGMFVASHRKMRAHQV 
LTFLLLFVITSVASENASTSRGCGLDLLPQYVSLC 
DLDAIWGIVVEAAAGAGALITLLLMLILLVRLPF 
FKEKEKKSPVGLHFLFLLGTLGP 


3246 


A 


3 


515 


HEVCGSGCCCHCCAGGPVARQKALPRLRGVMS 

RFLNVLRSWLVMVSIIAMGNTLQSFRDHTFLYEK 

LYTGKPNLVNGLQARTFGIWTLLSSVIRCLCAIDI 

HNKTLYHITLWTFLLALGHFLSELFVYGTAAPTI 

GVLAPLMVASFSILGMLVGLRYLEVEPVSRQKK 

RN 


3247 


A 


1 


932 


ERLCFPCMQSKIYSYMSPNKCSGMRFPLQEENSV 

THHEVKCQGKPLAGIYRKREEKRNAGNAVRSA 

MKSEEQKIKDARKGPLVPFPNQKSEAAEPPKTPP 

SSCDSTNAAIAKQALKKPIKGKQAPRKKAQGKT 

QQNRKLTDFYPVRRSSRKSKAELQSEERKRIDELI 

ESGKEEGMKIDLIDGKGRGVIATKQFSRGDFWE 

YHGDLIEITDAKKREALYAQDPSTGCYMYYFQY 

LSKTYCVDATRETNRLGRLINHSKCGNCQTKLH

WO 01/57190

DIDGVPHLTLIASRDIAAGEELLYDYGDRSKASIE 
AHPWLKH 


3248 


A 


3 


870 


PGSTISCSELKGTQCRATAGSRGRRPPMTCWLRG 

VTATFGRPAEWPGYLSHLCGRSAAMDLGPMRK 

SYRGDREAFEETHLTSLDPVKQFAAWFEEAVQC 

PDIGEANAMCLATCTRDGKPSARMLLLKGFGKD 

GFRFFTNFESRKGKELDSNPFASLVFYWEPLNRQ 

VRVEGPVKKLPEEEAECYFHSRPKSSQIGAWSH 

QSSVIPDREYLRKKNEELEQLYQDQEVPKPKSW 

GGYVLYPQVMEFWQGQTNRLHDRIVFRRGLPTG 

DSPLGPMTHRGEEDWLYERLAP 


3249 


A 


43 


1210 


TRVGRGESGLKMEVKPPPGRPQPDSGRRRRRRG 

EEGHDPKEPEQLRKLFIGGLSFETTDDSLREHFEK 

WGTLTDCVVMRDPQTKRSRGFGFVTYSCVEEV 

DAAMCARPHKVDGRVVEPKRAVSREDSVKPGA 

HLTVKKIFVGGDCEDTEEYNLRDYFEKYGKIETIE 

VMEDRQSGKKRGFAFVTFDDHDTVDKIVVQKY 

HTINGHNCEVKKALSKQEMQSAGSQRGRGGGS 

GNFMGRGGNFGGGGGNFGRGGNFGGRGGYGG 

GGGGSRGSYGGGDGGYNGFGGDGGNYGGGPG 

YSSRGGYGGGGPGYGNQGGGYGGGGGYDGYN 

EGGNFGGGNYGGGGNYNDFGNYSGQQQSNYGP 

MKGGSFGGRSSGSPYGGGYGSGGGSGGYGSRRF 


3250 


A 


32 


1175 


VAGRGDMAALRDAEIQKDVQTYYGQVLICRSAD 

LQTNGCVTTARPVPKHIREALQNVHEEVALRYY 

GCGLVIPEHLENCWILDLGSGSGRDCYVLSQLVG 

EKGHVTGIDMTKGQVEVAEKYLDYHMEKYGFQ 

ASNVTFIHGYIEKLGEAGIKNESHDIWSNCVINL 

VPDKQQVLQEAYRVLBMGGELYFSDVYTSLELP 

EEIRTHKVLWGECLGGALYWKELAVLAQKIGFC 

PPRLVTANLITIQNKELERVIGDCRFVSATFRLFK 

HSKTGPTKRCQVIYNGGITGHEKELMFDANFTFK 

EGEIVEVDEETAAILKNSRFAQDFLIRPIGEKLPTS 

GGCSALELKDIITDPFKLAEESDSMKSRCVPDAA 

GGCCGTKKSC 


3251 


A 


32 


1175 


VAGRGDMAALRDAEIQKDVQTYYGQVLKRSAD 

LQTNGCVTTARPVPKfflREALQNVHEEVALRYY 

GCGLVIPEHLENCWILDLGSGSGRDCYVLSQLVG 

EKGHVTGIDMTKGQVEVAEKYLDYHMEKYGFQ 

ASNVTFIHGYIEKLGEAGIKNESHDIWSNCVINL 

YPDKQQVLQEAYRVLKHGGELYFSDVYTSLELP 

EEIRTHKVLWGECLGGALYWKELAVLAQKIGFC 

PPRLVTANLITIQNKELERVIGDCRFVSATFRLFK 

HSKTGPTKRCQVIYNGGITGHEKELMFDANFTFK 

EGEIVEVDEETAAILKNSRFAQDFLIRPIGEKLPTS 

GGCSALELKDIITDPFKLAEESDSMKSRCVPDAA 

GGCCGTKKSC 


3252 


A 


1 


574 


PLGSNTAPALRVMVQAWYMDDAPGDPRQPHRP 

DPGRPVGLEQLRRLGVLYWKLDADKYENDPELE 

KIRRERNYS WMDHTICKDKLPNYEEKIKMFYEE 

HLHLDDEIRYILDGSGYFDVRDKEDQWIRIFMEK 

GDMVTLPAGIYHRFJVDEKNYTKAMRLFVGEPV 

WTAYNRPADHFEARGQYVKFLAQTA 


3253 


A 


2 


984 


ARAAAHCGICRLVRWWRKRRSVMGIQTSPVLLA 
SLGVGLVTLLGLAVGSYLVRRSRRPQVTLLDPNE

KYLLRLLDKTWSHNTKRFRFALPTAHHTLGLPV 

GKfflYLSTRIDGSLVIRPYTPVTSDEDQGYVDLVI 

KVYLKGVHPKFPEGGKMSQYLDSLKVGDWEF 

RGPSGLLTYTGKGHFNIQPNKKSPPEPRVAKKLG 

MIAGGTGITPMLQLIRAILKVPEDPTQCFLLFANQ 

TEKDnLREDLEELQARYPNRFKLWFTLDHFPKD 

WAYSKGFVfADMIREHLPAPGDDVLVLLCGPPP 

MVQLACHPNLDKLGYSQKMRFTY 


3254 


A 


1 


968 


LQSAGEGVTHVLILLESPARPVAAVTQVQRRRY 

HRLSDMSMLAERRRKQKWAVDPQNTAWSNDD 

SKFGQRMLEKMGWSKGKGLGAQEQGATDHIKV 

QVKNNHLGLGATINNEDNWIAHQDDFNQLLAEL 

NTCHGQETTDSSDKKEKKSFSLEEKSKISKNRVH 

YMKFTKGKDLSSRSKTDLDCIFGKRQSKKTPEG 

DASPSTPEENETTTTSAFTIQEYFAKRMAALKNK 

PQVPWGSDISETQVERKRGKKRNKEATGKDVE 

SYLQPKAKRHTEGKPERAEAQERVAKKKSAPAE 

EQLRGPCWDQSSKASAQDAGDHVQPA 


3255 


A 


173 


439 


GSAAMKVKIKCWNGVATWLWVANDENCGICR 

MAFNGCCPDCKVPGDDCPLVWGQCSHCFHMHC 

ILKWLHAQQVQQHCPMCRQEWKFKE 


3256 


A 


2 


377


TAARRRQKGTAARRRQKGTLEEVVLPPRSCRVF 
WIHSGTTMSKVSFKITLTSDPRLPYKVLSVPESTP 
FTAVLKFAAEEFKVPAATSAIITNDGIGINPAQTA 
GNVFLKHGSELRIIPRDRVGSC 


3257 


A 


3 


1454 


GCSAAAAGAGSGPWAAQEKQFPPALLSFFIYNPR 

FGPREGQEENKILFYHPNEVEKNEKIRNVGLCEAI 

VQFTRTFSPSKPAKSLHTQKNRQFFNEPEENFWM 

VMVVRNPIIEKQSKDGKPVIEYQEEELLDKVYSS 

VLRQCYSMYKLFNGTFLKAMEDGGVKLLKERL 

EKFFHRYLQTLHLQSCDLLDIFGGISFFPLDKMTY 

LKIQSFINRMEESLNIVKYTAFLYNDQLIWSGLEQ 

DDMRILYKYLTTSLFPRHIEPELAGRDSPIRAEMP 

GNLQHYGRFLTGPLNLNDPDAKCRFPKIFVNTD 

DTYEELHLIVYKAMSAAVCFMIDASVHPTLDFC 

RRLDSIVGPQLTVLASDICEQFNINKRMSGSEKEP 

QFKFIYFNHMNLAEKSTVHMRKTPSVSLTSVHPD 

LMKILGDINSDFTRVDEDEEIIVKAMSDYWVVG 

KKSDRRELYVILNQKNANLIEVNEEVKKLCATQF 

NNIFFLD 


3258 


A 


113 


1558 


APRGCSMPHRKKKPFmKKKAVSFHLVHRSQRD 

PLAADESAPQRVLLPTQKIDNEERRAEQRKYGVF 

FDDDYDYLQHLKEPSGPSELIPSSTFSAHNRREEK 

EETLVIPSTGIKLPSSVFASEFEEDVGLLNKAAPV 

SGPRLDFDPDIVAALDDDFDFDDPDNLLEDDFIL 

QANKATGEEEGMDIQKSENEDDSEWEDVDDEK 

GDSNDDYDSAGLLSDEDCMSVPGKTHRAIADHL 

FWSEETKSRFTEYSMTSSVMRRNEQLTLHDERFE 

KFYEQYDDDEIGALDNAELEGSIQVDSNRLQEVL 

NDYYKEKAENCVKLNTLEPLEDQDLPMNELDES 

EEEEMITVVLEEAKEKWDCESICSTYSNLYNHPQ 

LIKYQPKPKQIRISSKTGIPLNVLPKKGLTAKQTE 

RIQMINGSDLPKVSTQPRSKNESKEDKRARKQAI 

KEERKERRVEKKANKLAFKLEKRRQEKELLNLK 

KNVEGLKL


3259 


A 


3 


964 


QMEPGNDTQISEFLLLGFSQEPGLQPFLFGLFLSM 

YLVTVLGNLLIILATISDSHLHTPMYFFLSNLSFA 

DICVTSTTIPKMLMNIQTQNKVITYIACLMQMYF 

FILFAGFENFLLSVMAYDRFVAICHPLHYMVIMN 

PHLCGLLVLASWTMSALYSLLQILMVVRLSFCT 

ALEIPHFFCELNQVIQLACSDSFLNHMVIYFTVAL 

LGGGPLTGILYSYSKIISSIHAISSAQGKYKAFSTC 

ASHLSWSLFYGAILGVYLSSAATRNSHSSATAS 

VMYTWTPML>«>FIYSLRNKDIKRALGIHLLWGT 

MKGQFFKKCP 


3260 


A 


34 


2573 


IPFLKSCCCCCLFDFPPPPLDQVQEEECEVERVTE 

HGTPKPFRKFDSVAFGESQSEDEQFENDLETDPP 

NWQQLVSREVLLGLKPCEIKRQEVINELFYTERA 

HVRTLKVLDQVFYQRVSREGBLSPSELRKIFSNLE 

DILQLHIGLNEQMKAVRKRNETSVIDQIGEDLLT 

WFSGPGEEKLKHAAATFCSNQPFALEMIKSRQK 

KDSRFQTFVQDAESNPLCRRLQLKDIIPTQMQRL 

TKYPLLLDMATYTEWPTEREKVKKAADHCRQIL 

NYVNQAVKEAENKQRLEDYQRRLDTSSLKLSEY 

PNVEELRNLDLTKRKMIHEGPLXrvv^WRDKTro 

LYTLLLEDELVLLQKQDDRLVLRCHSKILASTAD 

SKHTFSPVIKLSTVLVRQVATDNKALFVISMSDN 

GAQIYELVAQTVSEKTVWQDLICRMAASVKEQS 

TKPIPLPQSTPGEGDNDEEDPSKLKEEQHGISVTG 

LQSPDRDLGLESTLISSKPQSHSLSTSGKSEVRDL 

FVAERQFAKEQHTDGTLKEVGEDYQIAIPDSHLP 

VSEERWALDALRNLGLLKQLLVQQLGLTEKSVQ 

EDWQHFPRYRTASQGPQTDSVIQNSENIKAYHSG 

EGHMPFRTGTGDIATCYSPRTSTESFAPRDSVGL 

APQDSQASNILVMDHMIMTPEMPTMEPEGGLDD 

SGEHFFDAREAHSDENPSEGDGAVNKEEKDVNL 

RISGNYLILDGYDPVQESSTDEEVASSLTLQPMT 

GIPAVESTHQQQHSPQNTHSDGAISPFTPEFLVQQ 

RWGAMEYSCFEIQSPSSCADSQSQIMEYIHKIEA 

DLEHLKKVEESYTILCQRLAGSALTDKHSDKS 


3261 


A 


1 


2100 


AVEFAEGALTMAPWPELGDAQPNPDKYLEGAA 

GQQPTAPDKSKETNKTDNTEAPVTKIELLPSYST 

ATLIDEPTEVDDPWNLPTLQDSGIKWSERDTKGK 

ILCFFQGIGRLILLLGFLYFFVCSLDILSSAFQLVG 

GKMAGQFFSNSSIMSNPLLGLVIGVLVTVLVQSS 

STSTSIVVSMVSSSLLTVRAAIPIIMGANIGTSITNT 

IVALMQVGDRSEFRRAFAGATVHDFFNWLSVLV 

LLPVEVATHYLEnTQLIVESFHFKNGEDAPDLLK 

VITKPFTKLIVQLDKKVISQIAMNDEKAKNKSLV 

KIWCKTFTNKTQINVTVPSTANCTSPSLCWTDGI 

QNWTMKNVTYKENIAKCQHIFVNFHLPDLAVGT 

ILLILSLLVLCGCLIMIVKILGSVLKGQVATVIKKT 

INTDFPFPFAWLTGYLAILVGAGMTFIVQSSSVFT 

SALTPLIGIGVITffiRAYPLTLGSNlGTTTTAILAAL 

ASPGNALRSSLQIALCHFFFNISGILLWYPIPFTRL 

PIRMAKGLGNISAKYRWFAVFYLIIFFFLIPLTVFG 

LSLAGWRVLVGVGVPVVFIIILVLCLRLLQSRCPR 

\^PKKLQNWNFLPLWMRSLKPWDAVVSKFTGC 

FQMRCCCCCRVCCRACCLLCGCPKCCRCSKCCE 

DLEEAQEGQDVPVKAPETFDNITISREAQGEVPA

SDSKTECTAL 


3262 


A 


30 


1377 


SQQGSQPHRQGPPSLLTAPHSLDLPALPPGPRGS 

QGKLRRVLVPMSVKPSWGPGPSEGVTAVPTSDL 

GEIHNWTELLDLFNHTLSECHVELSQSTKRVVLF 

ALYLAMFVVGLVENLLVICVNWRGSGRAGLMN 

LYILNMAIADLGIVI^LPVWMLEVTLDYTWLWG 

SFSCRFTHYFYFVNMYSSIFFLVCLSVDRYVTLTS 

ASPSWQRYQHRVRRAMCAGIWVLSAIIPLPEVV 

HIQLVEGPEPMCLFMAPFETYSTWALAVALSTTI 

LGFLLPFPLITVFNVLTACRLRQPGQPKSRRHCLL 

LCAYVAVFVMCWLPYHVTLLLLTLHGTfflSLHC 

HLVHLLYFFYDVIDCFSMLHC VINPILYNFLSPHF 

RGRLLNAVVHYLPKDQTKAGTCASSSSCSTQHSI 

IITKGDSQPAAAAPHPEPSLSFQAHHLLPNTSPISP 

TQPLTPS 


3263 


A 


1 


919 


QARSPSVAAMASPQLCRALVSAQWVAEALRAP 

RAGQPLQLLDASWYLPKLGRDARREFEERHIPG 

AAFFDIDQCSDRTSPYDHMLPGAEHFAEYAGRL 

GVGAAIHWIYDASDQGLYSAPRVWWMFRAFG 

HHAVSLLDGGLRHWLRQNLPLSSGKSQPAPAEF 

RAQLDPAFIKTYEDIKENLESRRFQWDSRATGR 

FRGTEPEPRDGIEPGHIPGTVNIPFTDFLSQEGLEK 

SPEEIRHLFQEKKVDLSKPLVATCGSGVTACHVA 

LGAYLCGKPDVPIYDGSWVEWYMRARPEDVISE 

GRGKTH 


3264 


A 


1 


1398 


ARRSTPRTAPRASATRSAAGTMREIVHIQAGQCG 

NQIGAKFWEVISDEHGIDPTGSYHGDSDLQLERJ 

NVYYNEAAGNKYVPRAELVDLEPGTMDSVRSGP 

FGQIFRPDNFVFGQSGAGNNWAKGHYTEGAELV 

DSVLDVVRKESESCDCLQGFQLTHSLGGGTGSG 

MGTLLISKIREEYPDRIMNTFSVMPSPKVSDTVVE 

PYNATLSVHQLVENTDETYSIDNEALYDICFRTL 

KLTTPTYGDLNHLVSATMSGVTTCLRFPGQLNA 

DLRKLAVNMVPFPRLHFFMPGFAPLTSRGSQQY 

RALTVPELTQQMFDSKNMMAACDPRHGRYLTV 

AAIFRGRMSMKEVDEQMLNVQNKNSSYFVEWIP 

NNVKTAVCDIPPRGLKMSATFIGNSTAIQELFKRI 

SEQFTAMFRRKAFLHWYTGEGMDEMEFTEAES 

NMNDLVSEYQQYQDATADEQGEFEEEEGEDEA 


3265 


A 


265 


862 


WWEDARVLGPFHPEEEGHWVMTPSEGARAGTG 

RELEMLDSLLALGGLVLLRDSVEWEGRSLLKAL 
VKKSALCGEQVHILGCEVSEEEFREGFDSDINNR 
LVYHDFFRDPLNWSKTEEAFPGGPLGALRAMCK 
RTDPVPVTIALDSLSWLLLRLPCTTLCQVLHAVS 
HQDSCPGETPPSLFPLIHLPLPRSVPLFLSTLE 


3266 


A 


2 


884 


AAGAGADGREPASERASRAEPPAVAMGQNDLM 

GTAEDFADQFLRVTKQYLPHVARLCLISTFLEDG 

IRMWFQWSEQRDYIDTTWNCGYLLASSFVFLNL 

LGQLTGCVLVLSRNFVQYACFGLFGIIALQTIAYS 

ILWDLKFLMRNLALGGGLLLLLAESRSEGKSMF 

AGVPTMRESSPKQYMQLGGRVLLVLMFMTLLH 

FDASFFSIVQNWGTAL]ym.VAIGFKTKLAALTLV 

VWLFAINVYFNAFWTIPVYKPMHDFLKYDFFQT 

MSVIGGLLLVVALGPGGVSMDEKKKEW 


3267 


A 


802 


1011 


ASTFCSAWKRRSTAALWWSGSRASRSHPRELGP
LCFVFGTAALSmSMDVLSLFLEHGKLVFASGLSP 
RA 


3268 


A 


490 


679 


EDAWITNPSLSNARSTPSKPLCYTVLKEGQVVGV 
KTTKASNTREKLRPESERRMVKSFGDEVT 


3269 


A 


2 


796 


GSTHASGARPSLKRARSQRGRPLPSRALPSAHKD 

NrrmAGPLHPYWPQHLRLDNFVPNDRPTWHILA 

GLFSVTGVLVVTTWLLSGRAAVVPLGTWRRLSL 

CWFAVCGFIHLVIEGWFVLYYEDLLGDQAFLSQ 

LWKEYAKGDSRYILGDNFTVCMETITACLWGPL 

SLWVVIAFLRQHPLRFILQLVVSVGQIYGDVLYF 

LTEHRDGFQHGELGHPLYFWFYFVFMNALWLV 

LPGVLVLDAVKHLTHAQSTLDAKATBCAKSKKN 


3270 


A 


17 


229 


GDTGPQILMSYLDSVASKLLQMVKKLSQSFCSNF 
KYLTKYSRKQVSDEIKKSRRTVESNPIFFKKNKKI 

Q 


3271 


A 


419 


553 


IQSGLSLCFADLSETPEGRAGVPGCPHSCDGVAS 

GRPCSPSSAG 


3272 


A 


1211 


1450 


FQFIQIELLNILQSLIRNQTQSPYNTTAYPAIDSVIT 
ILPFSFSCFFIITKCFGLSIFPSVIFFLHVYFILTLVVF 

YCC 


3273 


A 


59 


1562 


QAWSLQVALSPFFFPASPSNSFAAAVPQLLFPELP 

LPHVPGQESAKRRSARRFLIMSELTKELMELVW 

GTKSSPGLSDTIFCRWTQGFVFSESEGSALEQFEG 

GPCAVIAPVQAFLLKKLLFSSEKSSWRDCSQEEQ 

KELLCHTLCDILESACCDHSGSYCLVSWLRGKTT 

EETASISGSPAESSCQVEHSSALAVEELGFERFHA 

LIQKRSFRSLPELKDAVLDQYSMWGNKFGVLLF 

LYSVLLTKGIENIKNEIEDASEPLIDPVYGHGSQS 

LINLLLTGHAVSNVWDGDRECSGMKLLGIHEQA 

AVGFLTLMEALRYCKVGSYLKISKIPYLDCLASE 

THLTVFFAKDMALVAPEAPSEQARRVFQTYDPE 

DNGFIPDSLLEDVMKALDLVSDPEYINLMKNKL 

DPEGLGIILLGPFLQEFFPDQGSSGPESFTVYHYN 

GLKQSNYNEKVMYVEGTAVVMGFEDPMLQTD 

DTPIKRCLQTKWPYIELLWTTDRSPSLN 


3274 


A 


186 


1358 


RVVHRFFKSSAFWPAEVKQPRGGPKTGSRKEGA 

GSRAPQPVVRSFCGSVGAEGRMEKLRLLGLRYQ 

EYVTRHPAATAQLETAVRGFSYLLAGRFADSHE 

LSELVYSASNLLVLLNDGILRKELRKKLPVSLSQ 

QKLLTWLSVLECVEVFMEMGAAKVWGEVGRW 

LVIALIQLAKAVLRMLLLLWFKAGLQTSPPIVPL 

DRETQAQPPDGDHSPGNHEQSYVGKRSNRVVRT 

LQNTPSLHSRHWGAPQQREGRQQQHHEELSATP 

TPLGLQETIAEFLYIARPLLHLLSLGLWGQRSWK 

PWLLAGVVDVTSLSLLSDRKGLTRRERRELRRR 

TILLLYYLLRSPFYDRFSEARILFLLQLLADHVPG 

VGLVTRPLMDYLPTWQKIYFYSWG 


3275 


A 


575 


759 


SVYSASSCKCCNYRKTEQIPDCEQPPASSMPBRPS 
HESQPTPQMMPLSAPSRAEELGQRPG 


3276 


A 


7 


258 


KAAGHRLLLAAGHPSMPSSDCLLWEGSLELRPL 

QHISSLLVLVSTTCLFAFPRVPIAFESKSCLIYHCH 

CAFTVRHYMCSSHTG 


3277 


A 


9 


2221 


KLGVEPEEEGGGDDEEDAEAWAMELADVGAAA 

SSQGVPIDQVLPTPNASSRVIVHVDLDCFYAQVE 

MISNPELKDKPLGVQQKYLVVTCNYEARKLGVK

KLMNVRDAKEKCPQLVLVNGEDLTRYREMSYK 

VTELLEEFSPVVERLGFDENFVDLTEMVEKRLQQ 

LQSDELSAVTVSGHVYNNQSINLLDVLHIRLLVG 

SQIAAEMREAMYNQLGLTGCAGVASNKLLAKL 

VSGVFKPNQQTVLLPESCQHLIHSLNHIKEIPGIG 

YKTAKCLEALGINSVRDLQTFSPKILEKELGISVA 

QRIQKLSFGEDNSPVILSGPPQSFSEEDSFKKCSSE 

VEAKNKIEELLASLLNRLCQDERKPHTVRLIIRRY 

SSEKHYGRESRQCPIPSHVIQKLGTGNYDVMTPM 

VDILMKLFRNMV>rVKMPFHLTLLSVCFCNLKAL 

NTAKKGLIDYYLMPSLSTTSRSGKHSFKMKDTH 

MEDFPKDKETNRDFLPSGRJESTRTRESPLDTTNF 

SKEKDINEFPLCSLPEGVDQEVFKQLPVDIQEEIL 

SGKSREKFQGKGSVSCPLHASRGVLSFFSKKQM 

QDIPINPRDHLSSSKQVSSVSPCEPGTSGFNSSSSS 

YMSSQKDYSYYLDNRLKDERISQGPKEPQGFHF 

TNSNPAVSAFHSFPNLQSEQLFSRNHTTDSHKQT 

VATDSHEGLTENREPDSVDEKITFPSDIDPQVFYE 

LPEAVQKELLAEWKRTGSDFHIGHK 


3278 


A 


1 


876 


GLRLHVDLVEKPRTGIMAAETRNVAGAEAPPPQ 

KRYYRQRAHSNPMADHTLRYPVKPEEMDWSEL 

YPEFFAPLTQNQSHDDPKDKKEKRAQAQVEFAD 

IGCGYGGLLVELSPLFPDTLILGLEIRVKVSDYVQ 

DRIRALRAAPAGGFQNIACLRSNAMKHLPNFFY 

KGQLTKMFFLFPDPHFKRTKHKWRIISPTLLAEY 

AYVLRVGGLVYTITDVLELHDWMCTHFEEHPLF 

ERVPLEDLSEDPVVGHLGTSTEEGKKVLRNGGK 

NFPAIFRRIQDPVLQAVTSQTSLPGH 


3279 


A 


82 


2929 


TRTKRRLGREKAMASPPRGWGCGELLLPFMLLG 

TLCEPGSGQIRYSMPEELDKGSFVGNIAKDLGLE 

PQELAERGVRIVSRGRTQLFALNPRSGSLVTAGRI 

DREELCAQSPLCVVNFNILVENKMKIYGVEVEn 

DINDNFPRFRDEELKVKVNENAAAGTRLVLPFA 

RDADVGVNSLRSYQLSSNLHFSLDWSGTDGQK 

YPELVLEQPLDREKETVHDLLLTALDGGDPVLSG 

TTHIRVTVLDANDNAPLFTPSEYSVSVPENIPVGT 

RLLMLTATDPDEGINGKLTYSFRNEEEKISETFQL 

DSNLGEISTLQSLDYEESRFYLMEWAQDGGAL 

VASAKVWTVQDVNDNAPEVILTSLTSSISEDCL 

PGTVIALFSVHDGDSGENGEIACSIPRNLPFKLEK 

SVDNYYHLLTTRDLDREETSDYMTLTVMDHGT 

PPLSTESHIPLKVADVNDNPPNFPQASYSTSVTEN 

NPRGVSIFSVTAHDPDSGDNARVTYSLAEDTFQG 

APLSSYVSINSDTGVLYALRSFDYEQLRDLQLWV 

TASDSGNPPLSSNVSLSLFVLDQNDNTPEILYPAL 

PTDGSTGVELAPRSAEPGYLVTKVVAVDKDSGQ 

NAWLSYRLLKASEPGLFAVGLHTGEVRTARALL 

DRDALKQSLVVAVEDHGQPPLSATFTVTVAVAD 

RIPDILADLGSIKTPIDPEDLDLTLYLVVAVAAVS 

CVFLAFVIVLLVLRLRRWHKSRLLQAEGSRLAG 

VPASHFVGVDGVRAFLQTYSHEVSLTADSRKSH 

LIFPQPNYADTLLSEESCEKSEPLLMSDKVDANK 

EERRVQQAPPNTDWRFSQAQRPGTSGSQNGDDT 

GTWPNNQFDTEMLQAMILASASEAADGSS'ILGG 

GAGTMGLSARYGPQFTLQHVLQGELGSDYRQN

VYIPGSNATLTNAAGKRDGKAPAGGNGNKKKS 
GKKEKK 


3280 


A 


149 


1288 


GTSQMSSHKGSWAQGNGAPASNREADTAELAE 

LGPLLEEKGKRVIANPPKAEEEQTCPVPQEEEEE 

VRVLTLPLQAHHAMEKMEEFVYKVWEGRWRVl 

PYDVLPDWLKDNDYLLHGHRPPMPSFRACFKSIF 

mXETGNIWTHLLGFVlJFLFLGILTMLRPNMYF 

MAPLQEKWFGMFFLGAVLCLSFSWLFHTVYCH 

SEKVSRTFSKLDYSGIALLIMGSFVPWLYYSFYCS 

PQPRLIYLSIVCVLGISAirVAQWDRFATPKHRQT 

RAGVFLGLGLSGWPTMHFTIAEGFVKATTVGQ 

MGWFFLMAVMYITGAGLYAARIPERFFPGKFDI 

WFQSHQIFHVLVVAAAFVHFYGVSNLQEFRYGL 

EGGCTDDTLL 


3281 


A 


1 


557 


RPRRRQPSFSCRVLVLEDPPCFRFTNSMNQEKLA 

kLQAQVRIGGKGTARRKKKWHRTATADDKKL 

QSSLKKLAVNNIAGIEEVNMIKDDGTVIHFNNPK 

VQASLSANTFAITGHAEAKPITEMLPGILSQLGAD 

SLTSLRKLAEQFPRQVLDSKAPKPEDIDEEDDPV 

PDLVENFDEASKNEAN 


3282 


A 


155 


1139 


HALGRRGGSQELSAAACGCFALRLRAPGSGRPA 

LAPGAAAFAGLGGAPRFPPRGSAAGRTMLLKEY 

RICMPLTVDEYKIGQLYMISKHSHEQSDRGEGVE 

WQNEPFEDPHHGNGQFTEKRVYLNSKLPSWAR 

AVVPKIFYVTEKAWNYYPYTITEYTCSFLPKFSIH 

lETKYEDNKGSNDTIFDNEAKDVEREVCFIDIACD 

EIPERYYKESEDPKHFKSEKTGRGQLREGWRDSH 

QPIMCSYKLVTVKFEVWGLQTRVEQFVHKWR 

DILLIGHRQAFAWVDEWYDMTMDDVREYEKN 

MHEQTNIKVCNQHSSPVDDIESHAQTST 


3283 


A 


159 


547 


IKSKLNQQVEVQESEWRLTEAKGPTMGKESGW 
DSGRAAVAAWGGWAVGTVLVALSAMGFTSV 

GIAASSIAAKMMSTAAIANGGGVAAGSLVAILQS 
VGAAGLSVTSKVIGGFAGTALGAWLGSPPSS 


3284 


A 


227 


637 


TSNSLLRPDRMSVMDLANTCSSFQSDLDFCSDCG 

SVLPLPGAQDTVTCIRCGFNINVRDFEGKVVKTS 

WFHQLGTAMPMSVEEGPECQGPVVDRRCPRCG 

HEGMAYHTRQMRSADEGQTVFYTCTNCKFQEK 

EDS 


3285 


A 


123 


1535 


HRLSYDEAFAMANDPLEGFHEVNLASPTSPDLL 

GVYESGTQEQTTSPSVIYRPHPSALSSVPIQANAL 

DVSELPTQPVYSSPRRLNCAEISSISFHVTDPAPCS 

TSGVTAGLTKLTTRKDNYNAEREFLQGATnEAC 

DGSDDIFGLSTDSLSRLRSPSVLEVREKGYERLKE 

ELAKAQRELKLKDEECERLSKVRDQLGQELEEL 

TASLFEEAHKMVREANIKQATAEKQLKEAQGKI 

DVLQAEVAALKTLVLSSSPTSPTQEPLPGGKTPF 

KKGHTRNKSTSSAMSGSHQDLSVIQPrVKDCKEA 

DLSLYNEFRLWKDEPTMDRTCPFLDKIYQEDffP 

CLTFSKSELASAVLEAVENNTLSffiPVGLQPIRFV 

KASAVECGGPKKCALTGQSKSCKHRIKLGDSSN 

YYYISPFCRYRITSVCNFFTYIRYIQQGLVKQQDV 

DQMFWEVMQLRKEMSLAKLGYFKEEL 


3286 


A 


3 


589 


GPSQSMAAGELEGGKPLSGLLNALAQDTFHGYP 
GITEELLRSQLYPEVPPEEFRPFLAKMRGILKSIAS
ADMDFNQLEAFLTAQTKKQGGITSDQAAVISKF 
WKSHKTKIRESLMNQSRWNSGLRGLSWRVDGK 
SQSRHSAQIHTPVAIIELELGKYGQESEFLCLEFD 
EVKVNQILKTLSEVEESISTLISQPN 


3287 


A 


50 


390 


LGAMAKHHPDLIFCRKQAGVAIGRLCEKCDGKC 
VICDSYVRPCTLVRICDECNYGSYQGRCVICGGP 
GVSDAYYCKECTIQEKDRDGCPKIVNLGSSKTDL 
FYERKKYGFKKR 


3288 


A 


3 


428 


RTTFFRFRPCESLCGDMKLLTHNLLSSHVRGVGS 
RGFPLRLQATEVRICPVEFNPNFVARMIPKVEWS 
AFLEAADNLRLIQVPKGPVEGYEENEEFLRTMH 
HLLLEVEVIEGTLQCPESGRMFPISRGIPNMLLSE 

EETES 


3289 


A 


1 


1743 


AGCCRDTRFPTPRGPGSLCHNFCRSAACTVTRTI 

HGSPREDTGTPRSREMMFQDSVAFEDVAVSFTQ 

EEWALLDPSQKNLYRDVMQETFKNLTSVGKTW 

KVQNIEDEYKNPRRNLSLMREKLCESKESHHCG 

ESFNQIADDMLNRKTLPGITPCESSVCGEVGTGH 

SSLNTHIRADTGHKSSEYQEYGENPYRNKECKK 

AFSYLDSFQSHDKACTKEKPYDGKECTETFISHS 

CIQRHRVMHSGDGPYKCKFCGKAFYFLNLCLIH 

ERIHTGVKPYKCKQCGKAFTRSTTLPVHERTHTG 

VNADECKECGNAFSFPSEIRRHKRSHTGEKPYEC 

KQCGKVFISFSSIQYHKMTHTGEKPYECKQCGK 

AFRCGSHLQKHGRTHTGEKPYECRQCGKAFRCT 

SDLQRHEKTHTEDKPYGCKQCGKGFRCASQLQI 

HERTHSGEKPHECKECGKVFKYFSSLRIHERTHT 

GEKPHECKQCGKAFRYFSSLHIHERTHTGDKPYE 

CKVCGKAFTCSSSIRYHERTHTGEKPYECKHCGK 

AFISNYIRYHERTHTGEKPYQCKQCGKAFIRASS 

CREHERTHTINR 


3290 


A 


2 


1350 


GRPRSSSDNROTLRERAGLSSAAVQTRIGNSAAS 

RRSPAARPPVPAPPALPRGRPGTEGSTSLSAPAVL 

WAVAVVWVVSAVAWAMANYIHVPPGSPEVP 

KLNVTVQDQEEHRCREGALSLLQHLRPHWDPQE 

VTLQLFTDGITNKLIGCYVGNTMEDVVLVRIYGN 

KTELLVDRDEEVKSFRVLQAHGCAPQLYCTFNN 

GLCYEFIQGEALDPKHVCNPAIFRLIARQLAKIHA 

IHAHNGWIPKSNLWLKMGKYFSLIPTGFADEDIN 

KRFLSDIPSSQn.QEEMTWMKEILSNLGSPVVLCH 

NDLLCKNUYNEKQGDVQFIDYEYSGYNYLAYDl 

GNHFNEFAGVSDVDYSLYPDRELQSQWLRAYLE 

AYKEFKGFGTEVTEKEVEILFIQVNQFALASHFF 

WGLWALIQAKYSTIEFDFLGYAIVRFNQYFKMK 

PEVTALKVPE 


3291 


A 


102 


839 


PEAQTSA\a.AREKGHLPTMRHEAPMQMASAQD 

ARYGQKDSSDQNFDYMFKLLnGNSSVGKTSFLF 

RYADDSFTSAFVSTVGIDFKVKTVFKNEKRIKLQI 

WDTAGQERYRTITTAYYRGAMGFILMYDirNEE 

SFNAVQDWSTQIKTYSWDNAQVILVGNKCDME 

DERVISTERGQHLGEQLGFEFFETSAKDNINVKQ 

TFERLVDIICDKMSESLETDPAITAAKQNTRLKET 

PPPPQPNCAC 


3292 


A 


2 


4136 


DRPPWNSRVDDFVTNLIHLSSKGHISPAKDTSLQ 
QRTPAEMSPVLHFYVRPSGHEGAASGHTRRKLQ
GKLPELQGVETELCYNVNWTAEALPSAEETKKL 
MWLFGCPLLLDDVARESWLLPGSNDLLLEVGPR 
LNFSTPTSTNIVSVCRATGLGPVDRVETTRRYRLS 
FAHPPSAEVEAIALATLHDRMTEQHFPHPIQSFSP 
ESMPEPLNGPINILGEGRLALEKANQELGLALDS 
WDLDFYTKRFQELQRNPSTVEAFDLAQSNSBHS 
RHWFFKGQLHVDGQKLVHSLFESIMSTQESSNP
NNVLKFCDNSSAIQGKEVRFLRPEDPTRPSRFQQ 
QQGLRHVVFTAETHNFPTGVCPFSGATTGTGGRI 
RDVQCTGRGAHVVAGTAGYCFGNLHIPGYNLP 
WEDLSFQYPGNFARPLEVAIEASNGASDYGNKF 
GEPVLAGFARSLGLQLPDGQRREWIKPIMFSGGI 
GSMEADHISKEAPEPGMEWKVGGPVYRIGVGG 
GAASSVQVQGDNTSDLDFGAVQRGDPEMEQKM 
NRVIRACVEAPKGNPICSLHDQGAGGNGNVLKE 
LSDPAGAIIYTSRFQLGDPTLNALEIWGAEYQESN 
ALLLRSPNRDFLTHVSARERCPACFVGTITGDRRI 
VLVDDRECPVRRNGQGDAPPITPPTPVDLELEW 
VLGKMPRKEFFLQRKPPMLQPLALPPGLSVHQA 
LERVLRLPAVASKRYLTNKVDRSVGGLVAQQQC 
VGPLQTPLADVAVVALSHEELIGAATALGEQPV 
KSLLDPKVAARLAVAEALTNLVFALVTDLRDVK 
CSGNWMWAAKLPGEGAALADACEAMVAVMA 
ALGVAVDGGKDSLSMAARVGTETVRAPGSLVIS 
AYAVCPDITATVTPDLKHPEGRGHLLYVALSPG 
QHRLGGTALAQCFSQLGEHPPDLDLPENLVRAFS 
ITQGLLKDRLLCSGHDVSDGGLVTCLLEMAFAG 
NCGLQVDVPVPRVDVLSVLFAEEPGLVLEVQEP 
DLAQVLKRYRDAGLHCLELGHTGEAGPHAMVR 
VSVNGAVVLEEPVGELRALWEETSFQLDRLQAE 
PRCVAEEERGLRERMGPSYCLPPTFPKASVPREP 
GGPSPRVAILREEGSNGDREMADAFHLAGFEVW 
DVTMQDLCSGAIGLDTFRGVAFVGGFSYADVLG 
SAKGWAAAVTFHPRAGAELRRFRKRPDTFSLGV 
CNGCQLLALLGWVGGDPNEDAAEMGPDSQPAR 
PGLLLRHNLSGRYESRWASVRVGPGPALMLRG 
MEGAVLPVWSAHGEGYVAFSSPELQAQffiARGL 
APLHWADDDGNPTEQYPLNPNGSPGGVAGICSC 
DGRHLAVMPHPERAVRPWQWAWRPPPFDTLTT 
SPWLQLFINARNWTLEGSC 


3293 


A 


65 


642 


GVRGFWAGTMASRAGPRAAGTDGSDFQHRERV 

AMHYQMSVTLKYEIKKLIYVHLVIWLLLVAKMS 

VGHLRLLSHDQVAMPYQWEYPYLLSBLPSLLGLL 

SFPRNNISYLVLSMISMGLFSIAPLIYGSMEMFPA 

AQQLYRHGKAYRFLFGFSAVSIMYLVLVLAVQV 

HAWQLYYSKKLLDSWFTSTQEKKHK 


3294 


A 


35 


1821 


SQRSCPRSPSSPAPPWARCSNPDSRTGGVPVPRA 

WSAGGPALGLMAAPVRLGRKRPLPACPNPLFVR 

WLTEWRDEATRSRHRTRFVFQKALRSLRRYPLP 

LRSGIOEAKILQHFGDGLCRMLDERLQRHRTSGG 

DHAPDSPSGENSPAPQGRLAEVQDSSMPVPAQP 

KAGGSGSYWPARHSGARVILLVLYREHLNPNGH 

HFLTKEELLQRCAQKSPRVAPGSARPWPALRSLL 

HRNLVLRTHQPARYSLTPEGLELAQKLAESEGLS 

LLNVGIGPKEPPGEETAVPGAASAELASEAGVQQ

QPLELRPGEYRVLLCVDIGETRGGGHRPELLREL 

QRLHVTHTVRKLHVGDFVWVAQETNPRDPANP 

GELVLDHIVERKRLDDLCSSIIDGRFREQKFRLKR 

CGLERRVYLVEEHGSVHNLSLPESTLLQAVTNTQ 

VIDGFFVKRTADIKESAAYLALLTRGLQRLYQGH 

TLRSRPWGTPGNPESGAMTSPNPLCSLLTFSDFN 

AGAIKNKAQSVREVFARQLMQVRGVSGEKAAA 

LVDRYSTPASLLAAYDACATPKEQETLLSTIKCG 

RLQRNLGPALSRTLSQLYCSYGPLT 


3295 


A 


2 


1115 


EFHPHTQVSGLLTPQLQEPDVWSPSRGQPVSLHL 

PGKGAPEVKEMAWWKSWIEQEGVTVKSSSHFN 

PDPDAETLYKAMKGIGTNEQAIIDVLTKRSNTQR 

QQIAKSFKAQFGKDLTETLKSELSGKFERLIVAL 

MYPPYRYEAKELHDAMKGLGTKEGVIIEILASRT 

KNQLREIMKAYEEDYGSSLEEDIQADTSGYLERI 

LVCLLQGSRDDVSSFVDPALALQDAQDLYAAGE 

KIRGTDEMKFITILCTRSATHLLRVFEEYEKIANK 

SIEDSIKSETHGSLEEAMLTVVKCTQNLHSYFAE 

RLYYAMKGAGTRDGTLIRNIVSRSEIDLNLIKCH 

FKKMYGKTLSSMIMEDTSGDYKNALLSLVGSDP 


3296 


A 


1 


838 


GTRGGVGPGDNGGVEAGAKPGAAAIPLRGDGS 

GETGPGRVAPGEVRGSPRGHVAGPEGPREVLFFF 

FLPSSKPASEVINEYSWKVDFLKGMLQAEKLTSS 

SEKALANQFLAPGRVPTTARERVPATKTVHLQS 

RARYTSEMRSELLGTDSAEPEMDVRKRTGVAGS 

QPVSEKQSAAELDLVLQRHQNLQEKLAEEMLGL 

ARSLKTNTLAAQSVIKKDNQTLSHSLKMADQNL 

EKLKTESERLEQHTQKSVNWLLWAMLIIVCFIFIS 

MILFIRIMPKLK 


3297 


A 


46 


617 


HKQPAGFLGLWLGTETYTISFPGPETFGLGLSHA 

TGIPGSPACRQPVVGLHSLHNYRMAMVSAMSW 

VLYLWISACAMLLCHGSLQHTFQQHHLHRPEGG 

TCEVIAAHRCCNKNRIEERSQTVKCSCLPGKVAG 

TTRNRPSCVDASIVIGKWWCEMEPCLEGEECKTL 

PDNSGWMCATGNKIKTTRIHPRT 


3298 


A 


157 


748 


IQPPDPRNMTLAAYKEKMKELPLVSLFCSCFLAD 

PLNKSSYKYEADTVDLNWCVISDMEVIELNKCT 

SGQSFEVILKPPSFDGVPEFNASLPRRRDPSLEEIQ 

KKLEAAEERRKYQEAELLKHLAEKREHEREVIQ 

KAIEENNNFIKMAKEKLAQKMESNKENREAHLA 

AMLERLQEKDKHAEEVRKNKELKEEASR 


3299 


A 


5 


892 


TQLPAPLSGVLSRLQLGSGAPLLTWVQETAGVA 

GGAPRRRTPVTMWRLLARASAPLLRVPLSDSWA 

LLPASAGVKTLLPVPSFEDVSIPEKPKLRFIERAPL 

VPKVRREPHSrLSDIRGPSTEATEFTEGNFAILALG 

GGYLHWGHFEMMRLTINRSMDPKNMFAIWRVP 

APFKPITRKSVGHRMGGGKGAIDHYVTPVKAGR 

LWEMGGRCEFEEVQGFLDQVAHKLPFAAKAVS 

RGTLEKMRKDQEERERNNQNPWTFERIATANML 

GIRKVLSPYDLTHKGKYWGKFYMPKRV 


3300 


A 


2 


1847 


FVAGGPRGSGSAAETMPEIRVTPLGAGQDVGRS 
CILVSIAGKNVMLDCGMHMGFNDDRRFPDFSYI 
TQNGRLTDFLDCVnSHFHLDHCGALPYFSEMVG 
YDGPIYMTHPTQAICPILLEDYRKIAVDKKGEAN 
FFTSQMIKDCMKKVVAVHLHQTVQVDDELEIKA 






WO 01/57190 PCT/US01/04098



YYAGH\a.GAAMFQIKVGSESVVYTGDYNMTPD 

RHLGAAWIDKCRPNIXITESTYATTIRDSKRCRE 

RDFLKKVHETVERGGKVLIPVFALGRAQELCILL 

ETFWERMNLKVPIYFSTGLTEKANHYYKLFIPWT 

NQKIRKTFVQR>nvIFEFKHIKAFDRAFADNPGPM 

VVFATPGMLHAGQSLQIFMCWAGNEKNMVIMP 

GYCVQGTVGHKILSGQRKLEMEGRQVLEVKMQ 

VEYMSFSAHADAKGIMQLVGQAEPESVLLVHGE 

AKKMEFLKQKIEQELRVNCYMPANGETVTLPTS 

PSIPVGISLGLLKREMAQGLLPEAKKPRLLHGTLI 

MKDSNFRLVSSEQALKELGLAEHQLRFTCRVHL 

HDTRKEQETALRVYSHLKSVLKDHCVQHLPDGS 

VTVESVLLQAAAPSEDPGTKVLLVSWTYQDEEL 

GSFLTSLLKKGLPQAPS 


3301 


A 


2 


349 


CIRTEPAAAFRRLGALSGAAALGFASYGAHGAQ 
FPDAYGKELFDKANKHHFLHSLALLGVPHCRKP 
LWAGLLLASGTTLFCTSFYYQALSGDPSIQTLAP 
AGGTLLLLGWLALAL 


3302 


A 


59 


1184 


LRRNCSALGGLFQTnSDMKGSYPVWEDFINKAG 

KLQSQLRTTVVAAAAFLDAFQKVADMATNTRG 

GTREIGSALTRMCMRHRSIEAKLRQFSSALIDCLI 

NPLQEQMEEWKKVANQLDKDHAKEYKKARQEI 

KKKSSDTLKLQKKAKKGRGDIQPQLDSALQDVN 

DKYLLLEETEKQAVRKALBEERGRFCTFISMLRP 

VIEEEISMLGEITHLQTISEDLKSLTMDPHKLPSSS 

EQVILDLKGSDYSWSYQTPPSSPSTTMSRKSSVC 

SSLNSVNSSDSRSSGSHSHSPSSHYRYRSSNLAQQ 

APVRLSSVSSHDSGFISQDAFQSKSPSPMPPEAPN 

QRRKEKREPDPNGGGPTTASGPPAAAEEAQRPRS 

M 


3303 


A 


511 


958 


AGRGGPGKPVSWSSGPGSPGQTQRRSWVKSTRG 
HSSLLPPSQDFVAGLSVILRGTVDDRLNWAFNLY 
DLNKDGCITKEEMLDIMKSIYDMMGKYTYPALR 
EEAPREHVESFFQKMDRNKDGVVTIEEFIESCQK 
DENIMRSMQLFDNVI 


3304 


A 


40 


432 


ISEAASGAFQAR*FYQMVLEQKTDALGKQSVNRG 
FTKDKTLSSBFNIEMVKEKTAEEIKQIWQQYFAA 
KDTVYAVIPAEKFDLIWNRAQSCPTFLCALPRRE 
GYEFFVGQWTGTELHFHCTYKYSDPEGKA 


3305 


A 


2 


483 


LDACSTGPYSRSTHASADAWADAWVVWLKVV 
GMTLFLLYFPQIFNKSNDGFTTTRSYGTVSQIFGS 
RSPSPNGFITTRSYGTVCPKDWEFYQARCFFLIHL 
♦VSSWNESWDFCKGKGCTLAIVDNSETLKLLHDL 
HDAEKNYIALPYRSSKYMSTCNGTF 


3306 


A 


2 


872 


TLSSACLIGDAWKELTIVAGAVSNQLLVWYPAT 

ALADNKPVAPDRRISGHVGIEFSMSYLESKGLLA 

TASEDRSVRIWKGGDLRVPGGRVQNJGHCFGHS 

ARVWQVKLLENYLISAGEDCVCLVWSHEGEILQ 

AFRGHQGRGIRAIAAHERQAWVITGGDDSGIRL 

WHLVGRGYRGLG/DLGSLLQVP* * ARYTQGCDS 

GWLLATAGSD*YRGPVSL*RRGQVLGAAARG*T 

FPVLLPAGGSSWSRGLRIVCYGQWGRSCQGCPH 

QHSNCCCGPDPVSWEGAQLELGPAWL 


3307 


A 


2 


927 


RTSRVEKGLRKAGAAVTMESDEWFSQALPANTS 
AQKAELIALTQAIRWGKDINVNTDSRYAFATVH 
VRGAICQERRLLTSAEKAIK^^K^^PSSKP^^ 

WGTTCDQVNAKQGPKPSPGHRLRRNLPGEKWEl 

DFTKVKPHQAGYKYLLVLVDTFSGWTEAFATK 

NETVNMWKFLLNEUPRHGLPVAIGSDNGPAFA 

LSIV*SVSKALNIQWKLHCAYRPQSSGQVERMNC 

TLKN'nLTKLILETGVNWVSLLPLALLRVRCTPYW 

AGFLPFEIMYGRVLPILPKLRDAQLAKJSQTNLLQ 

YLQSP 


3308 


A 


490 


1077 


NSPSLDFNDNEDIPTELSDSSDTHDEGEVQAFYE 

DLSGROYVNEVFNFSVDKLYDLLFTNSPFORDF 

MEQRRFSDnFHPWKKEENGNQSRVIPYTITLTNP 

LEHKTATVRETQTMYKASQESECYVIDAEVLTH 

DVPYHDYFYTINRYTLTRVAKNKSRLRVSTELRY 

RKQPWGLVKTFIEKNFWSGLEDYFRHL 


3309 


A 


490 


1077 


NSPSLDFT^JDNEDIPTELSDSSDTHDEGEVQAFYE 

DLSGROYVNEVFNFSVDKLYDLLFTNSPFORDF 

MEQRRFSDIIFHPWKKEENGNQSRVIPYTITLTNP 

LEHKTATVRETQTMYKASQESECYVIDAEVLTH 

DVPYHDYFYTINRYTLTRVARNKSRLRVSTELRY 

RKQPWGLVKTFIEKNFWSGLEDYFRHL 


3310


A 


2 


1198 


SPLCHPGLSRER/S*SEAKLRSGRYC*KRQVEAPL 

*RPGL*TMAASDTERDGLAPEKTSPDRDKKKEQS 

EVSVSPRASKHHYSRSRSRSRERKRKSDNEGRKH 

RSRSRSKEGRRHESKDKSSKKHKSEEHNDKEHSS 

DKGRERLNSSENGEDRHKRKERKSSRGRSHSRS 

RSRERRHRSRSRERKKSRSRSRERKKSRSRSRER 

KKSRSRSRERKRRIRSRSRSRSRHRHRTRSRSRTR 

SRSRDRKKRTEKPRRFSRSLSRTPSPPPFRGRNTA 

iJlVkJXV.l-/XvX>JV-lvXi-/XVJl XVtVl ^X\xjXj\JX\,x x \JX X L X AX,VJXvX^ X ix. 

MDAQEALARRLERAKKLQEQREKEMVEKQKQQ 
EL\AAAAATGGSVLNVAALLASGTQVTPQIAMA 
AQMAALQAKALAETGIAVPSYYNPAAVNPMKF 
AEQEKKRKMLWQGKKEGDKSQSAGNMGKN 


3311 


A 


177 


4 


PIQIPPRITPPRPSPHLLTPRTGSSPPPPRAPSPPHPT 
PGPAHDFPPLSAVLSGHTKT 


3312 


A 


3 


426 


LESPRH*PPCWGPLIWALTVSSVPSPTPELSCILKS 

P/RPACPV/PGLWPSLLSPAPPQSSGPLLGLSPCPG 

AGQWPSPLSPAPPPSSDPLSGLSPCPGAGPRSSP\S 

ASAPCRAVPLSPRRLTWPPHLQVGILIPTGRPWK 

NL 


3313 


A 


162 


2 


QLQNLASRGCL*SQLLRRLRRENRLNPGGGGCSE 
L\P\CTPAWVTQRDFFRKKK 


3314 


A 


162 


2 


QLQNLASRGCL*SQLLRRLRRENRLNPGGGGCSE 
L\P\CTPAWVTQRDFFRKKK 


3315 


A 


466 


1 


PRKRESWWGERLP/PRGFPPAAEDAPAPGWKGR 
KHASRTARAHVFHPIRQSIRSPVRGRPGDPRAAH 
TRSAGTRLQCKASRGG*GKGPAPTR*EGGPGSAP 
APLPASSGCSLFPDSSPWTPPPPAPGAAAAQP**T 
PRCPAALRAGAfflGRVGRPY 


3316 


A 


3 


2307 


NHLGTLMQNWDSSSRVPFSSGQHSTQSFPPSLMS 

KSNSMLQKP'RAYVRPMDGQESMEPKLSSEHYSS 

QSHGNSMTELKPSSKAHLTKLKIPSQPLDASASG 

DVSCVDEILKEMTHSWPPPLTAIHTPCKTEPSKFP 

FPTKESQQSNFGTGEQKRYNPSKTSNGHQSKSM 

LKDDLKLSSSEDSDGEQDCDKTMPRSTPGSNSEP 

SHHNSEGADNSRDDSSSHSGSESSSGSDSESESSS 
SDSEANEPSQSASPEPEPPPTNKWQLDNWLNKV 

NPHKVSPASSVDSNPSSQGYKKEGREQGTGNSY 

TDTSGPKETSSATPGRVAPKPIQKGSESGRGRQKS 

PAQSDSTTQRRTVGKKQPKKAEKAAAEEPRGGL 

KIESETPVDLASSMPSSRHKAATKGSRKPNIKKES 

KSSPRPTAEKKKYKSTSKSSQKSREIIETDTSSSDS 

DESESLPPSSQTPKYPESNRTPVKPSSVEEEDSFFR 

QRMFSPMEEKELLSPLSEPDDRYPLIVKIDLNLLT 

RIPGKPYldETEPPKGEKKNVPEKHTREAQKQASE 

KVSNKGKRKHKNEDDNRASESKKPKTEDKNSA 

GHKPSSNRESSKQSAAKEKDLLPSPAGPVPSKDP 

STSKQKKTEGKTSSSSKEVKVKAPSSSSNCPPSAP 
ILDSSKPRRTXLVFDDRNYSADHYLQEAKKLKH 
NADALSDRFEKAVYYLDAWSFIECGNALEKNA 
QESKSPFPMYSETVDLI 




A 




9 


hATRDIKTAAKELLKKVKFffGSALNGMVEMMD 
RRPYWCISRQRVWGVPIPVFHHKTKDEYLINSQT 
TEHIVKLVEQHGSDIWWTLPPEQLLPKEVLSEVG 
GPDALEYVPGQDILDIWFDSGTSWSYVLPGPD 


3318 


A 


2 


512 


AWHEGDSRSDQCHHPYNYGFDYYYGMPFTLVD 

GKLSGWVSVPWLLIFSMILFIFLLGYAWFSSHTSP 
LYWDCLLMRGHEITEQPMKAEVRAGSIMVKEAIF 
LFRKGHSKGiaFLLFFLPFLQVHKTFPTTDGFHW 


3319 


A 


407 


1 


SSLHRSPRPASPLPVPEAPXSFLPVPAPKPSALPPFS 
LSGAPSSASTFSPHSSPSPASPTPAPSPQSPFPSRPT 
SPPSLTPTRRPPLPADRRGPHLLVQPLHAPLEAAA 
TGPE/PSAAAGRLPRPRPPWRAAYPASR 


3320 


A 


4037 


3432 


QMSEAVAEKMLQYRRDTAGWKICREGNGVSVS 
WPP^VFFPrtNT vpnpnTvvrsTT FFVWHPVK'PAV 

GGLRVKWDENVTGFEIIQSITDTLCVSRTSTPSAA 
MKLISPRDFVDLVLVKRYEDGTISSNATHVEHPL 
CPPKPGFVRGFNHPCGCFCEPLPGEPTKTNLVTFF 
HTDLSGYLPONWDSFFPR55MTRFYANT OKAVK 


3321 


A 


37 


360 


SHSASGAGRPAAPAADLRPAPNGQRPGPRLGAR 
ALWLPPRGRPDEAGRLPGEHLPQVPWDPGLTRS 
PSPRGPCRGAARAGHVGETPAPWGCPPPCAWEH 
KGPGSEGTP 




A 


1 


420 


r\l V £>JL/IVX10VJXvO I i^l 1 OJL/XjVJXN y LjXOx Olr\r^ X V IN VJ r\. 

ESSDSGAESDEEDAQEDLMGAYHSDIDKKMMJa 
VADHKNLEVIVTNGYDKDGFVHDIQNDIHASSSL 
NGRSTVHVKPIDENLGQTGKSAVCIHQDINDDH 
VEDVT 


3323 


A 


fi 

o 


459 


DTT SI Nm.PFTI PMTP<?F*T <5FI ♦FPfiT ARAKSTP 

TKTYSNEVVTLWYRPPDILLGSTDYSTQIDMW*G 

QVEVWQGPCGKGGGLVTTATQPAAFLFTVPSLP 

RGVGCIFYEMATGRPLFPGSTVEEQLHFIFRILSE 

EAWALCAVETHR 


3324 


A 


1276 


466 


PGSTHASARITIY*L*nLSNATEVDNNFSKPPPFFP 
AGAPPASSSSSSSSSSPPTVSTAPPLffPPGFPPPPG 
APPPSLIPTIESGHSSGYDSRSARAFPYGNVAFPH 
LPGSAPSWPSLVDTSKQWDYYARSSSSSSSSSSSS 
SSSPRDRDRER*RTRERERERDHSPTPSVFNSDEE 
RYRYREYAERGYERHRASREKEERHRERRHREK 
EETRHKSSRSNSRRRHESEEGDSHRRHKHKKSKR 
SKEGKEAGSEPAPEQESTEATPAE 


3325 


A 


266 


3312 


TCLFSASCSSLPSPSSSFALLSTENTQRTYRVNPD 

GSLRVTFASGMEIGLSSEPHILAGAVNPTLGKCNI 

SLPGEHNANLISVL**GEQGCA*NVFHISFS*AHN 

RmLSIDFDHITRTGKIYDDHRKFTLRILYDQTGR 

PILWSPVSRYNEVNITYSPSGLVTFIQRGTWNEK 

MEYDQSFL*SPQL*LSIICYSAFVSFQSVMLLLHS 

QRRYIFEYDQPDCLLSVTMPSMVRHSLQTMLSV 

GYYRNIYTPPDSSTSFIQDYSRDGRLLQTLHLGTG 

RRVLYKYTKQARLSEVLYDTTQVTLTYEESSGD 

LSDSSTLIA*LLTVFVLVPAGPLIGRQIFRFSEEGL 

VNARFDYSYKNFRVTSMQAVINETPLPIDLYRYV 

DVSGRTEQFGKFSVmYDLNQ\aTTTVMKHTKIF 

SANGQVIEVQYEILKAIAYWMTIQYDNVGRMVI 

CDIRVGVDANITRYFYEYDADGQLQTVSVNDKT 

QWRYSYDLNGNINLLSHGKSARLTPLRYDLRDRI 

TRLGEIQYKMDEDGFLRQRGNDIFEYNSNGLLQ 

KAYNKASGWTVQYYYDGLGRRVASKSSLGQHL 

QFFYADLTNPIRVTHLYNHTSSEITSLYYDLQGH 

LIAMELSSGEEYYVACDNTGTPLAVFSSRGQVIK 

EILYTPYGDIYHDTYPDFQVIIGFHGGLYDFLTKL 

VHLGQRDYDVVAGRWTTPNHHIWKQLNLLPKP 

FNLSTKLIKYGIFHFLFLILCLTDIRSWLELFGFQL 

HNVLPGFPKPELENSPSI*QMSNSMLHLLCASLS* 

TILGIQCELQKQLRNFISLDQLPMTPRYNDGRCLE 

GGKQPRFAAVPSVFGKGIKFAIKDGIVTADnGVA 

NEDSRRLAAILNNAHYLENLHFTIEGRDTHYFIK 

LGSLEEDLVLIGNTGGRRILENGVNVTVSQMTSV 

LNGRTRRFADIQLQHGALCFNIRYGTTVEEEKNH 

VLEIARQRAVAQAWTKEQRRLQEGEEGIRAWTE 

GEKQQLLSTGRVQGYDGYFVLSVEQ 


3326 


A 


290 


1041 


KACLHLLSSFLTSNFLFNPLLPDSLYSVEARSQRA 

NLGPCRRKRLQTLMRLAAGFQYSSHKDPSLSAK 

EKHTDYHNEARGPWPGWVG*RTADGSCGRGPD 

GAHHPGPKSSS WRA SRLLPGLGGSHHLDA YVGR 

DLECGTPAPLQLEIPPQPRGHPAPIPTGQAGPRDS 

GPGASP*VETRPLTDGRR*PGVRPVGWTPAHPAG 

TLRPRGAVEPSVSACGKWAPSPTSQGCCEGRCD 

AVPKHRAWRTPLCSQ 


3327 


A 


1 


418 


CSECGKSFCKKSKFIIHQRTHTGEKPYECNQCGK 

SFCQKGTLTVHQRTHTGEICPYECNECGKNFYQK 
LHLIQHQRTHSGEKPYECSYCGKSFCQKTHLTQH 
QRTHSGERPYVCHDCGKTFSQKSALNDHQKIHT 
GVKLY 


3328 


A 


1 


270 


VTRKLPIFIVDAFTARAFRGSPAADCLLENELDED 
MHQKIAREMNLSETAFIRKLHPTDNFAQRSCFGL 
IWFTPTTDLOILTSSILPSIL 


3329 


A 


45 


419 


EELSCWQIWQQIANDLTRCQDSMINNSQCHKQG 
DFPYQVGTELSIQISEDENYIVNKADGPNNTGNP 
EFPILRTQDSWRKTFLTESQRLNRDQQISIKNKLC 
QCKKGVDPIGWISHHDGHRVHKR 


3330 


A 


64 


430 


FWRNFTGLAPAAAVATTTSSSTMRFTSISNSLTST 
AAIGLSFTTSTTTTATFTTNTTTTITSGFTVNQNQ 
LLSRGFENLVPYTSTVSVVTTPVMTYGHLEGLIN 
EGNLELEDCRKLSSQATQ 


3331 


A 


3 


407 


TFGCSCTDCFFQKCCPAEAGVLLAYNKNQQIKIP 
PGTPIYECNSRCQCGPDCPNRIVQKGTQYSLCIFR 
TSNGRGWGVKTLVKIKRMSFVMEYVGEVITSEE 
AERRGQFYDNKGITYLFDLDYESDEFTVDAARY 


3332 


A 


25 


461 


PAADFVLQARPTRADILGIHSKYDEVRKAGACFY 

KMTGLGPGPQALYNGEPFKHEEMNIKELKMAVL 

QRMMDASVYLQREVFLGTLNDRTNAIDFLMDR 

NNVVPRINTLILRTNQQYLNLLSTSVTADAEDFS 

TFFFLDSQDKSA 


3333 


A 


317 


54 


AWIIFLPPLTSCPLWAPGTKHKTILEARSGLGPIK 
AYPRLGPPTPGEPEAPAQDRTFHCEICNVKVNSK 
VQLKQHISSRRHEIVDPV 


3334 


A 


304


410 


AGPSLPSNLRQIFQSLPPFMDBLLLLLFFMIIFAl 


3335 


A 


19 


418 


VESRNSRVQPRVRLNDRTNAIDFLMDRNNVVPRI 
NTLILRTNQQYLNLISTSVTADVEDFSTFFFLDSQ 
DKSAVIAKNMYYLTQDDESIISAATLWIIADFDK 
PSGRXLLFNALKHMITSVHSRVGIIYNPFF 


3336 


A 


1 


1003 


PSSYSSDELSPGEPLTSPPWAPLGAPERPEHLLNR 

VLERLAGGATRDSAASDILLDDIVLTHSLFLPTEK 

FLQELHQYFVRAGGMEGPEGLGRKQACLAMLL 

HFLDTYQGLLQEEEGAGHIIKDLYLLIMKDESLY 

QGLREDTLRLHQLVETVELKIPEENQPPSKQVKP 

LFRHFRRIDSCLOTRVAFRGSDEIFCRVYMPDHS 

YVmSRLSASVQDILGSVTEKLQYSEEPAGREDS 

LILVAVSSSGEKVLLQPTEDCVFTALGINSHLFAC 

TRDSYEALVPLPEEIQVSPGDTEIHRVEPEDVANH 

LTAFHWELFRCVHELEFVDYVFHGE 


3337 


A 


444 


43 


KILLCLANQFPDISFCPALPAWALLLHYSIDEAE 
CFEKACRJLACNDPGRRLIDQSFLAFESSCMTFGD 
LVNKYCQAAHKLMVAVSEDVLQVYADWQRWL 
FGELPLCYFARVFDVFLVEGYKVLYRVALAXXF 


3338 


A 


1 


398 


FRGKVRGRSAEMPGSDTALTVDRTYSDPGRHHR 
CKSRVERHDMNTLSLPLNIRRGGSDTNLNFDVPD 
GILDFHKVKLTADSLKQKILKVTEQIKIEQTSRDG 
NVAEYLKLVNNADKQQAGRIKQVFEKKNQK 


3339 


A 


1 


665 


AAAASNWGLITNIVNSIVGVSVLTMPFCFKQCGI 

VLGALLLVFCSWMTHQSCMFLVKSASLSKRRTY 

AGLAFHAYGKAGKMLVETSMIGLMLGTCIAFYV 

VIGDLGSNFFARLFGFQVGGTFRMFLLFAVSLCI 

VLPLSLQRNMMASIQSFSAMALLFYTVFMFVIVL 

SSLKHGLFSGQWLRRVSYVRWEGVFRCIPIFGMS 

FACQSQVLPTYDSLDEPSV 


3340 


A 


198 


367 


LLPLQVLQEAFSRCVAVLTRSSKPSDMSVQVCG 
YISKCYSVAAQFEECREKITEMP 


3341 


A 


562 


277 


HSVIKRTPRKYLAEIVLIDDFSNKEHLKEKLDEYI 
KLWNGLVKVFR>JERREGLIQARSIGAQKAKLGQ 
VLIYLDAHCEVAVNWYAPLVAPISKDR 


3342 


A 


385 


2 


hHLTWWPLFRDVSFYIVDLIMLIIFFLDNVIMWWE 
SLLLLTAYFCYVVFMKFNVQVEKWVKQMINRN 
KWKVTAPEAQAKPSAARDKDEPTLPAKPRLQR 
GGSSASLHNSLMRNSIFQNKIHTLDPHV 


3343 


A 


1 


385 


FRVDNSEEWKDVFIISSERSFKLDSLKCGTWYKV 
KLAAKNSVGSGRISEIIEAKTHGREPSFSKDQHLF 
THINSTHARLNLQGWNNGGCPITAIVLEYRPKGT 
WAWQGLRANSSGEVFLTELREATWY 


3344 


A 


351 


147 


SPACITSSLSQHIADPRAAPTEVKVRVMNSTAISL 
QWNRVYSDTVQGQLREYRVRKPAPDSPNYPAH 


3345 


A 


351 


147 


SPACITSSLSQHIADPRAAPTEVKVRVMNSTAISL 

QWNRVYSDTVQGQLREYRVRKPAPDSPNYPAH 


3346 


A 


3 


1509 


AGIRHEAPPTTSNRHRRQIDRGVTHLNISGLKNIP 

RGIAIDWVAGNVYWTDSGRDVIEVAQMKGENR 

KTLISGMIDEPHAIVVDPLRGTMYWSDWGNHPK 

lETAAMDGTLRETLVQDNIQWPTGLAVDYHNER 

LYWADAKLSVIGSIRLNGTDPIVAADSKRGLSHP 

FSIDVFEDYIYGVTYINNRVFKIHKFGHSPLVNLT 

GGLSHASDWLYHQHKQPEVTNPCDRKKCEWL 

CLLSPSGPVCTCPNGKRLDNGTCVPVPSPTPPPD 

APRPGTCNLQCFNGGSCFLNARRQPKCRCQPRY 

TGDKCELDQCWEHCRNGGTCAASPSGMPTCRCP 

TGFTGPKCTQQVCAGYCANNSTCTVNQGNQPQ 

GSRQCRCTAYFEGSRCEVNKCSRCLEGACVVNK 
QSGDVTCNCTDGRVAPSCLTCVGHCSNGGSCTM 
NSKMMPECOrPPHMTGPRCEEHVFSOOOPnHTA 
SILIP 


3347 


A 


974 


666 


SPEMESHPITQAGVQWHHLSSLQPLPPGFK*FSCF 
SLPE*LGYRHVPPCLANSVFSVEMG\FLHVGQAG 
LELLTSGDLPALASQSAGITG\SHRARPENGFENIF 


3348 


A 


1 


1171 


LSKITMPVICNEPLSFIQRLTEYM*HTYF1HRPSSL 

SDPVDRMQCVAAFAVSAVASQWERTGKPFNPLL 

GETYELVRDDLGFRLISEQVSHHPPISAFHAEGLN 

NDFIFHGSIYPKLKFWGKSVEAEPKGTITLELLEH 

NEAYTWTNPTCCVHNIIVGKLWIEQYGNVEIINH 

KTGDKCVLNFKPCGLFGKELHKVEGYIQDKSKK 

KLCALYGKWTECLYSVDPATFDAYKKNDKKNT 

FFKKNSKOMSTSFET DEMPVPDSESVFTTPGSVT T 

WRIAPRPPNSAQMYNFTSFAMVLNEVDKDMESV 

IPKTDCRLRPDIRAMENGEIDQASEEKKRLEEKQ 

RAARKNRSKSEEDWKTRWFHOGPNPYNGAOD 

WIYSGSYWDRNYFNLPDIY 


3349 


A 


403 


497 


NFASSSGKYLRTQKIKCLNNKFTPFPTTEKK*SQS 
VRPP*SNRIY*ILQS*NISFS*LPN*NFASSSGKYLR 
TQKIKCLNNKFTPFPTTEKK 


3350 


A 


1 


712 


GAPAQDCICLPFPFHSSFLESDIRKPARRKIQTTNP 
DFLLLLFMSVPWSAPPFCPPAEGSRDGRPKASV 

ARPA A VHEHH55PRDCGHLPDVIRSSLGG WOPH*P 

r\l\j V X 1 L jX XX lOA XVLa/V^VJXXi^X Xy r JLA VVJVJJU VJ VJ VV XX X 

AQPENRLL*LLPVE*GHQHPTVSPVP*AGSPGGAS 
G WPGPGQA WRVRVPGPHPLCPPASPPSPVQQ* *E 
SVAAGSGLPGCVLCAAGRRPGPLPLLCVEVGQA 
LPPGAWVSSSGQRPGLTHPLAYSHGCVPSEG 


3351


A 


1 

1 


428 


MAAVVAATALKGRGARNARVLRGTT AGATANK 

lYMS\r\ V V r\j^ X x^x^xvvjxvvji^xxi^i^xx. v xjiwjix^r\\jr\ ix^iixv 

ASHNRTRALQSHSSPEGKEEPEPLSPELEYIPRKR 
GKNPMKAVGLAWAIGFPCGILLFILTKREVDKDR 
VKQMKARQNMRLSNTGEYESQRFRASSQSAPSP 
DVGSGVQT 


3352 


A 


2 


841 


RTLFRGRRRREDDRISRPHPSTAESKAPTPKFDLL 
ASNFPPLPGSSSRMPGELVLENRMSDWKGVYK 



EKDNEELTISCPVPADEQTECTSAQQLNMSTSSP 

CAAELTALSTTQQEKDLIEDSSVQKDGLNQTTIP 

VSPPSTTKPSRASTASPCNNNINAATAVALOEPR 

KLSYAEVCQKPPKEPSSVLVQPLRELRSNVVSPT 

KNEDNGAPENSVEKPHEKPEARASKDYSGFRGN 

nPRGAAGKIREQRRQFSHRAIPQGVTRRNGKEQ 

YVPPRSPK 




A 




JO 1 


TATPTWTAPLTATPTPAHOYGPARVPNGAPRLEP 

x/\ M X X WT X T\xi^ X XX xx^ J^xxy^ X \JX JT^X^ V X X^\J*XX J.> 1 It *x 

PPGKRECRVGQYVVDLTSFEQLALPVLRNADCS 
SGPGQRVCVIDEIGKMELFSQLFIQAVRQTLSTPG 
TIILGTIPVPKGKPLALVEEIRNRKDVKVFNVTKE 
NRNHLLPDIVTCVQSSRK 


3354 


A 


56 


1268 


GMEPVGCCGECRGSSVDPRSTFVLSNLAEWER 

VLTFLPAKALLRVACVCRLWRECVRRVLRTHRS 

VTWISAGLAEAGHLEGHCLVRVVAEELENVRILP 

HTVLYMADSETFISLEECRGHKRARKRTSMETA 

LALEKLFPKQCQVLGIVTPGIVVTPMGSGSNRPQ 

EIEIGESGFALLFPQIEGIKIQPFHFIKDPKNLTLER 

HQLTEVGLLDNPELRVVLVFGYNCCKVGASNYL 

nnw^TF<5nM>JTTT Anonvr)>JT ^<sT t^fknpt dt 

V V O 1 POJ^lVll>JllJ_»/\VJVJv^ V J-'lNJ-zOOl-? J OJ-iJVl>IX JUl-'i 

DASGWGLSFSGHRIQSATVLLNEDVSDEKTAEA 
AMQRLKAANIPEHNTIGFMFACVGRGFQYYRAK 

GNVEADAFRKFFPSVPLFGFFGNGEIGCDRIVTG 
NFILRKCNEVKDDDLFHSYTTIMALIHLGSSK 


3355 


A 


1 


707 


GTSSGLGGDRLAAPGPSPPSFYPQGRGERAYDIY 
SRLLRERJVCVMGPIDDSVASLVIAQLLFLQSESN 
If If PTHMYrKlSPGnVVTAGT ATYDTMOYTT NPTPT 

JNJ\X IXUVl I xxH Ox VJ VJ V V 1 /iiVJJU/\l x LJ x 1V1\^ I IJ^iNx XVx X 

WCVGQAASMGSLLLAAGTPGMRHSLPNSRIMIH 
QPSGGARGQATDIAIQAEEIMKLKKQLYNIYAKH 
TKOST.OVTESAlVrERDRYMSPMEAOEFGILDKVL 
VHPPQDGEDEPTLVQKEPVEAAPAAEPVPAST 


3356


A 


352 


338 


FNYNFCRNLHMPSFLV*PGMCGLLAKHLSFHIVG 
AFLIT/LGVAALCKFAVA*PRKKAYADFYRNYN* 
IKEFEVRKANISOSTK 


3357 


A 


1 


403 


ALGSCGGLLGTGLLKGTMSGTLWSKGffAGYKR 
RIRIQREHTAVLKIEGWYARDETEFYLRMICANV 
YKANNNTVTPVLTPDKTRVMWRKVTQAHGISI 
MVRAQFRTNLPADAIGHRIRMNIL^PSRMYTTEPS 


3358 


A 


71 


2897 


FCSKDKCCLYLPDSINRSKSCTAKPGAHSQDRHA 

VMDSERQVKDTDDIESPKRSIRDSGYIDCWDSER 

SDSLSPPRHGRDDSFDSLDSFGSRSRQTPSPDWL 

RGSSDGRGSDSESDLPHRKLPDVKKDDMSARRT 

SHGEPKSAVPFNQYLPNKSNQTAYVPAPLRKKK 

AEREEYRKSWSTATSPAGLGKKALQDYGPR'nPV 

S\DDAESTSMFDMRCEEEAAVQPHSRARQEQLQ 

LINNQLREEDDKWQDDLARWKSIUCRSVSQDLIK 

KEEERKKMEKLLAGEDGTSERRKSnCTYREIVQE 

KERRERELHEAYKNARSQEEAEGILQQYffiRFTlS 

EAVLERLEMPKILERSHSTEPNLSSFLNDPNPMK 

YLRQQSLPPPKFTATVETTIARASVLDTSMSAGS 

GSPSKTVTPKAVPMLTPKPYSQPKNSQDVLKTFK 

VDGKVSVNGETVHREEEKERECPTVAPAHSLTK 

SQMFEGVARVHGSPLELKQDNGSIEINIKKPNSV 

PQELAATTEKTEPNSQEDK^JDGGKSRKGNIELAS 

SEPQHFTTTVTRCSPTVAFVEFPSSPQLKNDVSEE 
KDQKKPENEMSGKVELVLSQKWKPKSPEPEAT 

LTFPFLDKMPEANQLHLPNLNSQVDSPSSEKSPV 

TTPFKFWAWDPEEERRRQEKWQQEQERLLQER 

YQ\KEQDK\LKEE\WEKAQKEVEEEERRYYEEEP* 

nVEDPWPFTVSSSSADQLSTSSSMTEGSGTMNKI 

DLGNCQDEKQDRRWKKSFQGDDSDLLLKTRES 

DRLEEKGSLTEGALAHSGNPVSKGVHEDHQLDT 

EAGAPHCGTNPQLAQDPSQNQQTSNPTHSSEDV 

KPKTLPLDKSINHQIESPSERRKSISGKKLCSSCGL 

PLGKGAAMIIETLNLYFHIQCFRCGUCKGQLGDA 

VSGTDVIURNGLLNCNDCYMRSRSAGQPTTL 


3359 


A 


3 


368 


EVTASREGRGACAWECGSSRGPWGLLRGTFAPV 
RAATP*S*LPKGSLRHRP*/CPPPVHLPPKSSCPPR 
AWAGRATSM*TSSYSSEYQPQTP*ALVTLPPRSy 
YLLTHLLTLTHLHHQILFEP 


3360 


A 


2 


392 


ARGIGSLGRDHSGSGGGTGMAGAWVRKAADYV 
RSKDFRDYLMSTHFWGPVANWGLPIAAITDMK\ 
KSPEIISRRMTFAL*CYSLTFVRFAHYVQ\PWNWL 
MLGCHTAVDFDQLISSMPCISHGMTASASAL 


3361 


A 


4619 


532 


LLLGRANSPPYNSVVRTLPPATLLLRRAGWESF 

WSCQSRSPWPPRPEVRAPAKGPRGVAGAAGACS 

AGARLGDAAGGDPASGQAARGCGARAPRGLGR 

TARARDTAMEDAGAAGPGPEPEPEPEPEPEPAPE 

PEPEPKPGAGTSEAFSRLWTDVMGILDGSLGNID 

DLAQQYADYYNTCFSDVCERMEELRKRRVSQD 

LEVEKPDASPTSLQLRSQIEESLGFCSAVSTPEVE 

RKl^LHKSNSEDSSVGKGDWKKKlNJKYFWQNFR 

KNQKGIMRQTSKGEDVGYVASEITMSDEERIQL 

MMMVKEKMITIEEALARLKEYEAQHRQSAALDP 

ADWPDGSYPTFDGSSNCNSREQSDDETEESVKF 

KRLHKLVNSTRRVRKKLIRVEEMK^ 

HVFENSPVLDERSALYSGVHKKPLFFDGSPEKPP 

EDDSDSLTTSPSSSSLDTWGAGRKLVKTFSKGES 

RGLIKPPKKMGTFFSYPEEEKAQKVSRSLTEGEM 

KKGLGSLSHGRTCSFGGFDLTNRSLHVGSNNSDP 

MGKEGDFVYKEVIKSPTASRJSLGKKVKSVKET 

MRKRMSICKYSSSVSEQDSGLDGMPGSPPPSQPD 

PEHLDKPKLKAGGSVESLRSSLSGQSSMSGQTVS 

TTDSSTSNRESVKSEDGDDEEPPYRGPFCGRARV 

HTDFTPSPYDTDSLKLKKGDIIDIISKPPMGTWMG 

LLNNKVGTFNFIYVDVLSED\EEKPKRPTRRRRK 

GRPPQPKSVEDLLDRINLKEHMPTFLFNGYEDLD 

TFKLLEEEDLDELNIRDPEHRADLLTAVELLQEY 

DSNSDQSGSQEKLLVDSQGLSGCSPRDS*CYESS 

ENLENGKTRKASLLSAKSSTEPSLKAFSRNQLGN 

YPTLPLMKSGDALKQGQEEGRLGGGLAP\DTSKS 

CDPPGC*LVLN\KNRRKPPSFPSCRSC\ETL\EGPQ 

TVDTWPRSHSLDDLQVEPGAEQDVPTEVTEPPPQ 

IVPEVPQKTTASSTKAQPLEQDSAVDNALLLTQS 

KRFSEPQKLTTKKLEGSIAASGRGLSPPQCLPRNY 

DAQPPGAKHGLARTPLEGHRKGHEFEGTHHPLG 

TK£uVDAEQRMQPKIPSQPPPVPAKKSRERLANG 

LHPVPMGPSGALPSPDAPCLPVKRGSPASPTSPSD 

CPPALAPRPLSGQALGSPPSTRPPPWLSELPENTS 

LQEHGVKLGPALTR\KVSCARGVDLETLTENKL\ 
HAEGIRSSRJREPYS*LRHGRCGI\P\EALVQRYAED 
LDQPERDVAAhnVIDQIRVKQLRKQHRMAIPSGGL 
TEICRKPVSPGCISXSVSDWLISIGLPMYAGTLSTA 
GFSTL\SQVPSLSHTCLQEAG\ITEiERHIRK\LLSAA 
RLFKLPPGPEAM 


3362 


A 


1 


4653 


FRGGVGYAHTLHLLPFAGSSVVLARARRTDRWT 

SGLVEMATLSLTVNSGDPPLGALLAVEHVKDDV 

SISVEEGKENILHVSENVIFTDVNSILRYLARVAT 

TAGLYGSNLMEHTEIDHWLEFSATKLSSCDSFTS 

TINELNHCLSLRTYLVGNSLSLADLCVWATLKG 

NAAWQEQLKQKKAPVHVKRWFGFLEAQQAFQS 

VGTKWDVSTTKARVAPEKKQDVGKFVELPGAE 

MGKVTVRFPPEASGYLHIGHAKAALLNQHYQV 

NFKGKLIMRFDDTNPEKEKEDFEKVILEDVAML 

HIKPDQFTYTSDHFETIMKYAEKLIQEGKAYVDD 

TPGEQIKAEREQRIESKHRKM>ffiKNLQMWEEMK 

KGSQFGHSCCLRAKIDMSSNNGCMRDPTLYRCK 

IQPHPRTGN*Y\NV\YPTYDFACPIVDSIEGVTHAL 

RTTEYHDRDEQFYWIIEALGIRKPYIWEYSRLNL 

NNTVLSKRKLTWFVNEGLVDGWDDPRFPTVRG 

VLRRGMTVEGLKQFIAAQGSSRSVVNMEWDKI 

WAFNKKVIDPVAPRYVALLKKEVIPVNVPEAQE 

EMKEVAKHPKNPEVGLKPVWYSPKVFIEGADAE 

TFSEGEMVTFmWGNLNITKJHKNADGKHSLDAK 

LNLENKDYKKTTKVTWLAETTHALPIPVICVTYE 

HLITKPVLGKDEDFKQYVNKNSKHEELMLGDPC 

LKDLKKGDIIQLQRRGFFICDQPYEPVSPYSCKEA 

PCVLIYIPDGHTKEMPTSGSKEKTKVEATBCNETS 

APFKERPTPSLNNNCTTSEDSLVLYNRVAVQGD 

VVRELKAKKAPKEDVDAAVKQLLSLKAEYKEK 

TGQEYKPGNPPAEIGQNISSNSSASILESKSLYDE 

VAAQGEVVRKLKAEKSPKAKINEAVECLLSLKA 

QYKEKTGKEYIPGQPPLSQSSDSSPTRNSEPAGLE 

TPEAKVLFDKVASQGEVVRKLKTEKAPKDQVDI 

AVQELLQLKAQYKSLIGVEYKPVSATGAEDKDK 

KKKEKENKSEKQNKPQKQNDGQRKDPSKNQGG 

GLSSSGAGEGQGPKKQTRLGLEAkK\EENLADW 

YSQVITKSEMIEYHDISGCYILRPWAYAIWEAIKD 

FFDAEIKKLGVENCYFPMFVSQSALEKEKTHVA 

DFAPEVAWVTRSGKTELAEPIAIRPTSETVMYPA 

YAKWVQSHRDLPIKLNQWCNVVRWEFKHPQPF 

LRTREFLWQEGHSAFATMEEAAEEVLQILDLYA 

QVYEELLAIPVVKGRKTEKEKFAGGDYTTTIEAF 

ISASGRAIQGGTSHHLGQNFSKMFEIVFEDPKIPG 

EKQFAYQNSWGLTTRTIGVMmVHGDNMGLVL 

PPRVACVQVVIIPCGITNALSEEDKEALIAKCNDY 

RRRLLSVNIRVRADLRDNYSPGWKFNHWELKG 

VPIRLEVGPRDMKSCQFVAVRRDTGEKLTVAEN 

EAETKLQAILEDIQVTLFTRASEDLKTHMWANT 

MEDFQKILDSGKIVQIPFCGEIDCEDWIKKTTARD 

QDLEPGAPSMGAKSLCIPFKPLCELQPGAKCVCG 

KNPAKYYTLFGRSY 


3363 


A 


3797 


1514 


LGGAAPETMPFPVTTQGSQQTQPPQKHYGITSPIS 

LAAPKETDCVLTQK\LI\ETLKPFGGFLKKEEGTA 

SRRNFNFGKN*INLVKEWIRRNQ*KAKNLPQSVI\ 
ENV\GGKIFT/FLGSYRL/GEVHTKGADIDGVCVF 

APRHVDRSDFFTiSFYDKLKLQEEVKDLRAVEEA 

FVPVIKLCFDGIEIDILFARLALQTIPEDLDLRDDS 

LLKNLDIRCIRSLNGCRVTDEILHLVPNIDNFRLT 

LRAIKLWAKRHNIYSNILGFLGGVSWAMLVART 

CQLYPNAIASTLVHKFFLVFSKWEWPNPVLLKQP 

EECNLNLPVWDPRVNPSDRYHLMPHTPAYPQQN 

STYNVSVSTRMVMVEEFKQGLAITDEELLSKAE 

WSKLFEAPNFFQKYKHYIVLLASAPTENQRLEW 

VGLVESKIRILVGSLEKNEFITLAHVNPQSFPAPK 

ENPDKEEFRTMWVIGLVFKKTENSENLSVDLTY 

DIQSFTDTVYRQAINSKMFEVDMKIAAMHVKRK 

QLHQLLPNHVLQKKKKHSTEGVKLTALNDSSLD 

LSMDSDNSMSVPSPTSATKTSPLNSSGSSQGRNS 

PAPAVTAASVTNIQATEVSVPQVNSSESSGGTSSE 

f?IPnTATOPATSPPPKPTVSRVVSSTRI VNPPPRSS 

GNAATSGNAATKIPTPIVGVKRTSSPHKEESPKK 

TKTEEDETSEDANCLALSGPDOKTEAKEQLDTETS 

TTQSETIQTAASLLASQKTSSTDLSDIPALPANPIP 

VIKNSIKLRLNR 


3364 


A 


54 


3073 


SARTMSYDYHQNWGRDGGPRSSGGGYGGGPAG 

GHGGNRGSGGGGGGGGGGRGAVQGPASRAPER 

PRNRHVVREKTGAEEQAVKRRGKREL/LVHMDE 

RREEQIVQLLNSVQAKNDKESEAQISWFAPEDHG 

YGTEVSTKNTPCSENKLDIQEKKLINQEKKMFRI 

KNRSYIDRDSEYLLQENEPDGTLDQKLLEDLQKK 

KNDLRYIEMQHFREKLPSYGMQKELVNLIDNHQ 

VTVISGETGCGKTTQVTQFILDNYIERGKGSACRI 

VCTQPRRISAISVAERVAAERAESCGSGNSTGYQI 

RLQSRLPRKQGSILYCTTGIILQWLQSDPYLSSVS 

fflVLDEIHERNLQSDVLMTVVKDLLNFRSDLKVI 

LMSATLNAEKFSEYFGNCPMIHIPGFTFPWEYLL 

EDVIEKIRYVPEQKEHRCQFKRGFMQGHVNSQE 

KEEKEAIYKERWPDYVRELRRRYSASTVDVIEM 

MEDDKVDLNLIVALIRYIVLEEEDGAILVFLPGW 

DNISTLHDLLMSQVMFKSDKFLIIPLHSLMPTVN 

QTQVFKRTPPGVRKIVIATNIAETSITIDDVVYVID 

GGKKETHFDTQNNISTMSAEWVSKANAKQRKG 

RAG\RVQPGSLLFICINGS*EASLLGWTIQLPEIF/R 

GTPLEELCLQIKVLRLGGI/GLFLSRLMDPPSNEA 

VLLSIRQL\RSLNALDKQEELTPLGVHLARLPVEP 

HIGKMILFGALFCCLDPVLTIAASLSFKDPFVIPLG 

KEKIADARRKELAKDTRSDHLTVVNAFEGWEEA 

RRRGFRYEKDYCWEYFLSSNTLQMLHNMKGQF 

AEHLLGAGFVSSRNPKDPESNINSDNEKIIKAVIC 

AGLYPKVAKIRLNLGKKRKMVKVYTKTDGLVA 

VHPKSVNVEQTDFHYNWLIYHLKMRTSSIYLYD 

CTEVSPYCLLFFGGDISIQKDNDQETIAVDEWIVF 

QSPARIAHLVKRAVVHMDERREEQIVQLLNSVQ 

AKNDKESEAQISWFAPEDHGYDKKYFFKE 


3365 


A 


439 


878 


ECCNVRPLRETDLLKMKRKPRASSPVVEEQPRA 

NTKETRKKKSFSQPMSASTKEESQDGRRKGK*L 

KGRARKKNAPQKSMALRILEEGSRPTPSGHSDQL 

NEEL*QNELQLEQ/PEGT*LEQQSEGTQPEQQSGR 

MPTISTLSLSSE 


3366 


A 


1 


827 


FRGYWGVREAFTDASWSGGLGPGKPGMKITRQ 
KHAKKHLGFFRNNFGVREPYQILLDGTFCQAAL 
RGRIQLREQLPRYLMGETQLCTTRCVLKELETLG 
KDLYGAKLIAQKCQVRNCPHFKNAVSGSECLLS 

FnQNTIsm.DKPSPKTIAFV^VESGVRLSQCMRK 
KVSNISKRNRV* *KTLNRGRRKKRKKISGPNPLS 
CLKXKKKAPDTQSSASEKKRKRKRIRNRSNPKV 

LSEKQNAEGE 


3367 


A 


40 


1467 


MLWGCRAKACWGPRLSDLVASLSPQRECISVHV 

GQAGVQIGNACWELFCLEHGIQADGTFDAQASK 

INDDDSFTTFFSETGNGKHVPRAVMIDLEPTVVD 

EVRAGTYRQLFHPEQLITGKEDAANNYARGHYT 

VGKESIDLVLDRIRKLTDACSGLQGFLIFHSFGGG 

TGSGFTSLLMERLSLDYGKKSKLEFAIYPAPQVS 

TAVVEPYNSELTTHTTLEHSDCAFMVDNEAIYDI 

CRRNLDIERPTYTNLNRLISQIVSSITASLRFDGAL 

NVDLTEFQTNLVPYPRIHFPLVTYAPIISAEKAYH 

EQLSVAEITSSCFEPNSQMVKCDPRHGKYMACC 

IVJLLf I SSAJlJ V V iJSAJ V IN V /\Lrvr\lJV 1 JVI\. 1 1 v^r V L/ W 1 

GFKVGINYQPPTVVPGGDLAKVQRAVCMLSNTT 
AIAEAWARLDHKFDLMYAKRAFVHWYVGEGM 
EEGEFS*RPGEDLA\ALEVKDYEEVGTDSFEEENE 

GEEF 


3368 


A 


3 


2597 


SLLEETMDEDSSLREYTVSLDSDMDDASKCLQE 

YDSGTGNTREALRPCPRTVSTKAQPGRSASSSSG 

DKTTSFAEQKIRKLNHTDGESSGSSSQKTTPEGSE 

LNffHAGAWAQBPEETGLPQGRDTTQLLASEMV 

HLMMK\LKEKR\RAI*AQKKKMEAAFTKQRQKM 

GRTAFLTWKKKGDGISPLREEAAGAEDEKVYT 

DRAKEKESQKTDGQRSKSLADIKESMENPQAKW 

LKSPTTPIDPEKQGNLASPSEETLNEGEILEYTKSI 

EKLNSSLHFLQQEMQRLSLQQEMLMQMREQQS 

WVISPPQPSPQKQIRDFKPSKQAGLSSAIAPFSSDV 

SPRVPTHPSSTSLLNRKSASFSVKSQRTPRPNELKI 

TPLNRTLTPPRSVDSLPRLRRFSPSQVPIQTRSFVC 

FGDDGEPQLKESKPKEEVKKEELESKGTLEQRG 

HNPEEKEIKPFESTVSEVLSLPVTETVCLTPNEDQ 

LNQPTEPPPKPVFPPTAPKNVNLIEVSLSDLKPPE 

KADVPVEKYDGESDKEQFDDDQKVCCGFFFKD 

DQKAENDMAMKRAALLEKRLRREKETQLRKQQ 

LEAEMEHKKEETREUCTEEERQKKEDERARREFIR 

QEYMRRKQLKIJ^IEDMDTVIKPRPQVVKQKKQR 

PKSIHRDHffiSPKTPIKGPPVSSLSLASLNTGDNES 

VHSGKRTPRSESVEGFLSPSRCGSRNGEKDWEN 

AHCCLAGKVNEGQKKKILEEMEKSDANNFLILF 
RDSGCQFRSLYTYCPETEEINKLTGIGPKSITKKM 
lEGLYKYNSDRKQFSHIPAKTLSASVDAITIHSHL 
WOTK'RPVTPKKI T PTKA 


3369 


A 


977 


594 


RGSGLTQEPGSVGQLALACAEGAVEWLYPAGAL 
RLTLGGPDPRARPGIACLRPVRPFAGAQVFAERA 
GGALELLLAEGPGPAGGRCVRWGPRERRALFLQ 
ATPHQDISRRVAAFRFELREDGRPEIAP 


3370 


A 


345 


1383 


DLSLECTGFKETNLGVYFLSSKWVLRLYALHIID 

YSAVLFPC*AMDHLESFIAECDRRTELAKKRLAE 

TQEEISAEVSAKAEKVHELNEEIGKLLAKAEQLG 

AEGNVDESQKILMEVEKVRAKKKEAEKTVAEK 

QEKRNQDRLRRREEREREERLSRRSGSRTRDRRR 

SRSRDRRRRRSRSTSRERRKLSRSRSRDRHRRHR 

SRSRSHSRGHKRASRDRSAKYKFSRERASREESW 

ESGRSERGPPDWRLESSNGKMASRRSEEKEAG/G 

DLLNRMIVWKHGLLI 


3371 


A 


345 


1383 


DLSLECTGFKETNLGVYFLSSKWVLRLYALHIID 

YSAVLFPC*AMDHLESFIAECDRRTELAKKRLAE 

TQEEISAEVSAKAEKVHELNEEIGKLLAKAEQLG 

AEGNVDESQKILMEVEKVRAKKKEAEKTVAEK 

QEKRNQDRLRRREEREREERLSRRSGSRTRDRRR 

SRSRDRRRRRSRSTSRERRKLSRSRSRDRHRRHR 

SRSRSHSRGHRRASRDRSAKYKFSRERASREESW 

ESGRSERGPPDWRLESSNGKMASRRSEEKEAG/G 

DLLNRMIVWKHGLLI 


3372 


A 


239 


3348 


PMQNCMCSLTLSVLPLGPQPPVPEKRPPEIQHFR 

MSDDVHSLGKVTSDLAKRRKLTS\*GGLSEELGS 

ARRSGEVTLTKGDPGSLEEWETVVGDDFSLYYD 

SYSVDERVDSDSKSEVEALTEQLSEEEEEEEEEEE 

EEEEEEEEEEEEEDEESGNQSDRSGSSGRRKAKK 

KWRKDSPWVKPSRKRRKREPPRAKEPRGVNGV 

GSSGPSEYMEVPLGSLELPSEGTLSPNHAGVSND 

TSSLETERGFEELPLCSCRMEAPKIDRISERAGHK 

CMATESVDGELSGCNAAILKRETMRPSSRVALM 

VLCETHRARMVKHHCCPGCGYFCTAGTFLECHP 

DFRVAHRFHKACVSQLNGMVFCPHCGEDASEA 

QEVTIPRGDGVTPPAGTAAPAPPPLSQDVPGRAD 

TSQPSARMRGHGEPRRPPCDPLADTIDSSGPSLTL 

PNGGCLSAVGLPLGPGREALEKALVIQESERRKK 

LRFHPRQLYLSVKQGELQKVILMLLDNLDPNFQS 

DQQSKRTPLHAAAQKGSVEICHVLLQAGANINA 

VDKQQRTPLMEAVVNNHLEVARYMVQRGGCV 

YSKEEDGSTCLHHAAKIGNLEMVSLLLSTGQVD 

VNAQDSGGWTPIIWAAEHKHIEVIRMLLTRGAD 

VTLTDNEENICLHWASFTGSAAIAEVLLNARCDL 

HAVNYHGDTPLHL^iARESYHDCVLLFLSRGANP 

ELRNKEGDTAWDLTPERSDVWFALQLNRKLRL 

GVGNRAIRTEKIICRDVARGYENVPIPCVNGVDG 

EPCPEDYKYISENCETSTMNIDRNITHLQHCTCV 

DDCSSSNCLCGQLSIRCWYDKDGRLLQEFNKIEP 

PLIFECNQACSCWRNCKNRVVQSGIKVRLQLYR 

TAKMGWGVRALQTIPQGTFICEYVGELISDAEAD 

VREDDSYLFDLDNKDGEVYCIDARYYGNISRFIN 

HLCDPNIIPVRVFMLHQDLRFPRL\FFSSRDIRTGE 

ELGFDYGDRFWDDCSKYFTCQCGSEKCKHSAEAI 

ALEQSRLARLDPHPELLPELGSLPPVNT 


3373 


A 


587 


1584 


PDGRLIVSCSEDKTIKIWDTTNKQCVNNFSDSVG 

FANFVDFNPSGTCL\SAGSDQTVKVWDVRVNKL 

LQHYQVHSGGVNCISFHPSGNYLITASSDGTLKIL 

DLLKGRLIYTLQGHTGPVFTVSFSKGGELFASGG 

ADTQVLLWRTNFDELHCKGLTKRNLKRLHFDSP 

PHLLDF^RTPHPHEEKVETVEDFFLHLLRLIQSL 

R*SICRSLLPLLWISFLLILPQQQKPWGLCQTRV 

KRPVDIS*TLP*CHQNVCQQPRKRKQKT*VTSPV 

KVKA^SIPLAVTDALEHIMEQLNVLTQTVSILEQR 

LTLTEDKLKDCLENQQKLFSAVQQKS 


3374 


A 


398 


21 


WLYPMALSILDIKMSPSWYFHMAIGIINWNTTAG 
LSGTLYPKVPQKYILFDSVILLLGMLRKIRQVCQ 
NVYMKGCSPITLFKIVHYWPGAVAHAYNPSTLG 
GQVG/WQIT*GQEFETSLDYMVKPHLY 


3375 


A 


3 


1051 


VPTQQILAFPEQTNTKDWTVTPEHVLPESQSLLT 

FEEVAMYFSQEEWELLDPTQKALYNDVMQENY 

ETVISLALFVLPKPKVISCLEQGEEPWVQVSPEFK 

DSAGKSPTGLKLKNDTENHQPVSLSDLEIQASAG 

VISKKAKVKVPQKTAGKENHFDMHRVGKWHQ 

DFPVKKRKKLSTWKQELLKLMDRHKKDCAREK 

PFKCOECGKTFRVSS\DL\IKHORIHTEEKPYKCO 

QCDKRFRWSSDLNKHLTTHQGIKPYKCSWGGKS 

FSQNTNLHTHQRTHTGEKPFTCHECGKKFSQNS 

HLIKHRRTHTGEQPYTCSICRRNFSRRSSLLRHQK 

LHL*REACPVSHFWKTF 


3376 


A 


137 


2329 


SFESPAPLPSTCFPQERQDPGPCYVSGAMAGLGP 

GVGDSEGGPRPLFCRKGALRQKVVHEVKSHKFT 

ARFFKQPTFCSHCTDFIWGIGKQGLQCQVCSFVV 

HRRCHEFVTFECPGAGKGPQTDDPRNKHKFRLH 

SYSSPTFCDHCGSLLYGLVHQGMKCSCCEMNVH 

RRCVRSVPSLCGVDHTERRGRLQLEIRAPTADEI 

HVTVGEARNLIPMDPNGLSDPYVKLKLIPDPRNL 

TKQKTRTVKATLNPVWNETFVFNLKPGDVERRL 

SVEVWDWDRTSRNDFMGAMSFGVSELLKAPVD 

GWYKLLNQEEGEYYNVPVADADNCSLLQKFEA 

CNYPLELYERVRMGPSSSPIPSPSPSPTDPKRCFFG 

ASPGRLHISDFSFLMVLGKGSFGKVMLAERRGSD 

ELYAIKILKKDVIVQDDDVDCTLVEKRVLALGG 

RGPGGRPHFLTQLHSTFQTPDRLYFVMEYVTGG 

DLMYHIQQLGKFKEPHAAFYAAEIAIGLFFLHNQ 

GIIYRDLKLDNVMLDAEGHIKITDFGMCKENVFP 

GTTTRTFCGTPDYIAPEIIAYQPYGKSVDWWSFG 

VLLYEMLAGOPPFDGEDEEELFOAIMEOTVTYP 

KSLSREAVAICKGFLTKHPGEAPGASGP*WGNLT 

IRAHGFFPLGFDWERLERL\EIPASFSRPRPCGPQR 

RGIFDKFFTRAAPA\LTPPARLVLDSIDQADFQGF 

TYVNPDFVQPDARSPTSTVHVPVM 


3377 


A 


918 


738 


SSMLWGFSVFRRSWILNCWLSSSQVGISAACKFS 
TLTHTHTHTHTHTRHAPFCGTCLYY 


3378 


A 


1126 


456 


FSKLIMKTFIIGISGVTNSGKTTLAKNLQKHLPNC 

SVISQDDFFKPESEIETDKNGFLQYDVLEALNME 

KMMSAISCWMESARHSVVSTDQESAEEIPILIIEG 

FLLFNYKPLDTIWNRSYFLTIPYEECKRRRSTRVY 

QPPDSPGYFDGHVWPMYLKYRQEMQDITWEVV 

YLDGTKSEEDLFLQVYEDLIQELAKQKCLQVTA* 

RRNTTNPS/CK*IRKLQGVI 


3379 


A 


1126 


456 


FSKLIMKTFIIGISGVTNSGKTTLAKNLQKHLPNC 

SVISQDDFFKPESEIETDKNGFLQYDVLEALNME 

KMMSAISCWMESARHSWSTDQESAEEIPILIIEG 

FLLFNYKPLDTIWNRSYFLTIPYEECKRRRSTRVY 

QPPDSPGYFDGHVWPMYLKYRQEMQDITWEVV 

YLDGTKSEEDLFLQVYEDLIQELAKQKCLQVTA* 

RKNTTNPS/CK*IRKLQGVI 


3380 


A 


1443 


794 


ARRGELAGGGRASGGRSGGDGGGGGGARAPEG 

VRAPAAGQPRATKGAPPPPGTPPPSPMSSAIERKS 

LDPSEEPVDEVLQIPPSLLTCGGCQQNIGDRYFLK 

AIDQYWHEDCLSCDLCGCRLGEVGRRLYYKLGR 

KLCRRDYLRLFGQDGLCASCDKRIRAYEMTMRV 

KDKVYHLECFKCAACQKHFCVGDRYLLINSDIV 

CEQDIYEWTKINGMI 


3381 


A 


945 


474 


SLKLRKPPLPTDGVHFVFVESQLDFWGPQEMLT 
QQGMALQNYDNKLVKCIEELCQKQEELCWQIQ 
QEEDKKQRLQNEVRQLTEKLACVNEKLARVNE 
NLARKIASCSKFYQTIAETEATYLKILESF*\TLLS 
VRKREAGNLTKATAPDQKSSGGRDS 


3382 


A 


1 


1458 


GIRGKMADRGGVGEAAAVGASPASVPGLNPTLG 

WRERLRAGLAGTGASLWFVAGLGLLYALRIPLR 

LCENLAAVTVFLNSLTPKFYVALTGTSSLISGLin 

FEWWYFHKHGTSFIEQVSVSHLQPLMGGTESSIS 

EPGSPSRNRENETSRQNLSECKVWRNPLNLFRGA 

EYRRYTWVTGKEPLTYYDMNLSAQDHQTFFTC 

DTDFLRPSDTVMQKAWRERNPPARIKAAYQALE 

LN/E*LCHCICSTG*GRSNNYCRC*KVI*TGTQGR 

RNNL*AVTAVPAPKSSA*SSTEERYQCTG1Y*LKI 

GNVCKKIRKNKRSSKNNERFDE*ISSSYHVEHP* 

KSLVKSLLELQAYPDVQAVLAKYDDISLPKSAAIC 

YTAALLKTRTVSEKFSPETASTRGLSAAEINAVD 

AIHRAVEFNPHVPKYLLEMKSLILPPEHILKRGDS 

EAIAYAFFHLQHWKRIEGALNLLQCTWEGSKYS 

FPKVTLISLTIH 


3383 


A 


282 


2443 


RGKGFKEFFLGVCQTFIPCLCAEGIQLQFFCSGSG 

SSPLLKDLESMKTGLFFLCLLGTAAAIPTNARLLS 

DHSKPTAETVAPDNTAIPSLRAEAEENEKETAVS 

TEDDSHHKAEKSSVLKSKEESHEQSAEQG\KSS\S 

QELGIEGFKRDSDGSL*VWNL\EYGTNLKGTLDI 

KEDMSEPQEKKLSENTDFLAPGVSSFTDSNQQES 

ITKREENQEQPRNYSHHQLNRSSKHSQGLRDQG 

NQEQDPNISNGEEEEEKEPGEVGTHNDNQERKTE 

\LPREHANSKQEEDNTQSDDILEESDQPTQVSKM 

QEDEFDQGNQEQEDNSNAEMEEENASNVNKHIQ 

ETEWQSQEGKTGLEAISNHKETEEKTVSEALLME 

PTDDGNTTPRNHGVDDDGDDDGDDGGTDGPRH 

SA\SDDYFHPKPGLFWEAERA\HS1AYSPSKLREQ 

REKVHENENIGTTEPGEHQEAKKAENSSNEEETS 

SEGNMRWHAVDSCMSFQCKRGHICKADQQGKT 

SLVSCQDPVT\CPPTKPLDQVCGTDNQTYASSCH 

LFATKCRLEGTKKGHQLQLDYFG\ASKSIPTiCRD 

FEVIQ\FPLRMRDW\LKNILMQLYEANSEHAGYL 

NEK\QRNKVKK1YL\DEKRLLAGDHPIDLLLRDFK 

KNYHMYVYPVHWQFSELDQHPMDRVLTHSELA 

PLRASLVPMEHCITRFFEECDPNKDKHITLKEWG 

HCFGKEEDroENLLF 


3384 


A 


3166 


928 


PSRPHPTHAAMAGPEGFQYRALYPFRRERPEDLE 

LLPGDVLVVSRAALQALGVAEGGERCPQSVGW 

MPGLNERTRQRGDFPGTYVEFLGPVALARPGPR 

PRGPRPLPARPRDGAPEPGLTLPDLPEQFSPPDVA 

PPLLVKLVEAIERTGLDSESHYRPELPAPRTDWSL 

SDVDQWDTAALADGIKSFLLALPAPLVTPEASAE 

ARRALREAAGPVGPALEPPTLPLHRALTLRPLLQ 

HLGRVASRAPALGPAVRALGATFGPLLLRAPPPP 

SSPPPGGAPDGSEPSPDFPALLVEKLLQEHLEEQE 

VAPPALPPKPPKAK\PASTVPGPNGGSPPSL\QDA 

EWYWGDMSREEVNEKLRDTPDGTFLVRDASSKI 

QGEYTLTLRKGGNNKLnCVFHRDGHYGFSEPLTF 

CSVVDLINHYRHESLAQYNAKLDTRLLYPVSKY 

QQDQIVKEDSVEAVGAQLKVYHQQYQDKSREY 

DQLYEEYTRTSQELQMKRTAIEAFNETIKIFEEQG 

QTQEKCSKEYLERFRREGN/QTKEMQRILLNSER 

LKSRIAXEIHESRTMCLVEQQLLVPRASDNKRD/IDK 

PH*TSLKPDLMQLRKIRDQYLVWLTQKGARQKK 

VGKINRTQAEEMLSGKRDGTFLIRESSQRGCYAC 
SVVVDGDTKHCVIYRTATGFGFAEPYNLYGSLK 
ELVLHYQHASLVQHNDALTVTLAHPVRAPGPGP 
PPAAR 


3385 


A 


43 


2372 


TRD\nNSWKELCFNHYNKETTNCYRTTRKWTNY 

KIIFLGPFRELRSQGNQVILNLGKERCQLRETGLK 

LYLPGMDSARHfflSHSTSAGPIPSQKEEEMTESQ 

GTVTFKDVAIDFTQEEWKRLDPAQRKLYRNVML 

*NYNNLITVGYPFTKPDVIFKLEQEEKPWVMEEE 

VLRRHWQGEIWGVDEHQKNQDRLLRQVEVKFQ 

KTLTEEKGNECQKKFANVFPLNSDFFPSRHNLYE 

YDLFGKCLEHNFDCHNNVKCLMRKEHCEYNEP 

VKSYGNSSSHFVITPFKCNHCGKGFNQTLDLIRH 

LRIHTGEKPYECSNCRKAFSHKEKLIKHYKIHSRE 

QSYKCNECGKAFIKMSNLIRHQRIHTGEKPYACK 

ECEKSFSQKSNLIDHEKIHTGEKPYECNECGKAFS 

QKQSLIAHQKVHTGEKPYACNECGKAFPRIASLA 

LHMRSHTGEKPYKCDKCGKAFSQFSMLIIHVRIH 

TGEKPYECNECGKAFSQSSALTVHMRSHTGEKP 

YECKECRKAFSHKKNFITHQKIHTREKPYECNEC 

GKAFIQMSNLVRHQRIHTGEKPYICKECGKAFSQ 

KSNLIAHEKIHSGEKPYECNECGKAFSQKQNFIT 

HQKVHTGEKPYDCNECGKAFSQIASLTLHLRSHT 

GEKPYECDKCGKAFSQCSLLNLHMRSHTGEKPY 

VCNECGKAFSQRTFLIVHMRGHTGEKPYECNEC 

GKAFSQSSSLTIHIRGHTGEKPYECKECRKAFSHK 

KNFITHQKIHTRE/KPFKCNHCGKGFNQTLDLIRH 

LRIHTGEKPYECSNCRKAFSHKEKLIKHYKIHSRE 

QSYKCNECGKAFIKMSNLIRHQRIHTGEKPYACK 

ECEKSFSQKSNLIDHEKIHTGEKPYECNECGKAFS 

QKQSLIAHQKVHTGEKPYACNECGKAFPRIASLA 

LHMRSHTGEKPYKCDKCGKAFSQFSMLIIHVRIH 

TGEKPYECNECGKAFSQSSALTVHMRSHTGEKP 

YECKECRKAFSHKKNFITHQKIHTREKPYECNEC 

GKAFIQMSNLVRHQRIHTGEKPYICKECGKAFSQ 

KSNLIAHEKIHSGEKPYECNECGKAFSQKQNFIT 

HQKVHTGEKPYDCNECGKAFSQIASLTLHLRSHT 

GEKPYECDKCGKAFSQCSLLNLHMRSHTGEKPY 

VCNECGKAFSQRTFLIVHMRGHTGEKPYECNEC 

GKAFSQSSSLTIHIRGHTGEKPYECKECRKAFSHK 

KNFITHQKIHTRENPLSVirVEKASIRLWTSSDI 


3386 


A 


201 


1032 


WDDYPQGALRKREAAEGLHFLGPPGRVRGQLR 

GITGPAWYCHSPSHSLLSAFCHLPTPSRCPAMAR 

PPVPGSVWPNWHES/RRGQGVPGLHSAQEPPAG 

VWAA*AASAAAAVLSIDTASYKIFVSGKSGVGKT 

ALVAKT AGLEVPVVHHETTGIOTTVVFWPAKLO 

ASSRVVMFRraFWDCGESALKKFDHMLLACME 

NTDAFLFLFSFTDRASFEDLPGQLARIAGEAPGV 

VRMVIGSKFDQYMHTDVPERDLTAFRQAWELPL 

LRVKSVPGRRLG 


3387 


A 


86 


96 


GSSPDPASLITNIKNQDKKNGAAKQSNPKSSPGQP 

EAGPEGAQERPSQAAPAVEAEGPGSSQAPRKPEG 

AQARTAQSGALRDVSEELSRQLEDILSTYCVDNN 

QGGPGEDGAQGEPAEPEDAEKSRTYVARNGEPE 

PTPWNGEKEPSKGDPNTEEIRQSDEVGDRDHRR 

PQEKKKAKGLGKEITLLMQTLNTLSTPEEKLAAL 

CKKYAELLEEHRNSQKQMKLLQKKQSQLVQEK 

DHLRGEHSKAVLARSKLESLCRELQRHNRSLKE 

EGVQRAREEEEKRKEVTSHFQVTLNDIQLQMEQ 

HNERNSKLRQENMELAERLKKLffiQYELREEHID 

KVFKHKDLQQQLVDAKLQQAQEMLKEAEERHQ 

LYTEKFEEFQNTLSKSSEVFTTTKQEMEKMTKKI 
KKLEKETTMYRSRWESSNKALLEMAEEKTVRD 
KELEGLQVKIQRLEKLCRALQT/GAQ*PVRGQRW 
GSHRTSAVRJDFS 


3388 


A 


98 


3197 


ARPEVPAPPAWLSRRGAAKMGDKKDDKDSPKK 

NKGICERRDLDDLKKEVAMTEHKMSVEEVCRKY 

NTDCVQGLTHSKAQEILARDGPNALTPPPTTPEW 

VKFCRQLFGGFSILLWIGAILCFLAYGIQAGTEDD 

PSGDNLYLGIVLAAWnTGCFSYYQEAKSSKIME 

SFKNMVPQQALVIREGEKMQVNAEEWVGDLV 

EKGGDRVPADLRIISAHGCKVDNSSLTGESEPQT 

RSPDCTHE\NPLKTRNITFFSNNFVEGTARGVVVA 

TGDRTVMGRIATLASGLEVGKTPIAIEIEHFIQLIT 

GVAVFLGVSFFILSLILGYTWLEAVffLIGnVANV 

PEGLLATVTVCLTLTAKRMARKNCLVKNLEAVE 

TLGSTSTICSDKTGTLTQNRMTVAHMWFDNQIH 

EADTTEDQSGTSFDKSSHTWVALF*H/LLGFCNR 

PVFKGGQDNBPVLKRDVAGDASESALLKCIELSS 

GSVKLMRERNKKVAEIPFNSTNKYQLSIHETEDP 

NDNRYLLVMKGAPERJLDRCSTILLQGKEQPLDE 

EMKEAFQNAYLELGGLGERVLGFCHYYLPEEQF 

PKGFAFDCDDVNFTTDNLCFVGLMSMIGPPRAA 

VPDAVGKCRSAGIKVIMVTGDHPITAKAIAKGV 

GIIFEGNETVEDIAARLNIPVSQVNPRDAKACVIH 

GTDLKDFTSEQIDEILQNHTErVFARTSPQQKLirV 

EGCQRQGAIVAVTGDGVNDSPALKKADIGVAM 

GIAGSDVSKQAADMILLDDNFASIVTGVEEGRLI 

FDNLKKSIAYTLTSNIPEITPFLLFMANIPLPLGTI 

TELCIDLGTDMVPAISLAYEAAESDIMKRQPRNPR 

TDKLVNERLISMAYGQIGMIQALGGFFSYFVILA 

ENGFLPGNLVGIRLNWDDRTVNDLEDSYGQQW 

TYEQRKWEFTCHTAFFVSIVWQWADLIICKTR 

RNSVFQQGMKNKILIFGLFEETALAAFLSYCPGM 

DVALRMYPLKPSWWFCAFPYSFLIFVYDEIRKLI 

LRRNPGGWVEKETYY 


3389 


A 


45 


5250 


VERLLGCRNSKRTWRMLISKNMPWRRLQGISFG 

MYSAEELKKLSVKSITNPRYLDSLGNPSANGLYD 

LALGPADSKEVCSTCVQDFSNCSGHLGHIELPLT 

VYNPLLFDKLYLLLRGSCLNCHMLTCPRAVIHLL 

LCQLRVLEVGALQAVYELERILNRFLEENPDPSA 

SEIREELEQYTTEIVQNNLLGSQGAHVKNVCESK 

SKLIALFWKAHMNAKRCPHCKTGRSVVRKEHNS 

KLTITFPAMVHRTAGQKDSEPLGIEEAQIGKRGY 

LTPTSAREHLSALWKNEGFFLNYLFSGMDDDGM 

ESRFNPSVFFLDFLVVPPSRYRPVSRLGDQMFTN 

GQTVNLQAVMKDVVLIRKLLALMAQEQKLPEE 

VATPTTDEEKDSLIAIDRSFLSTLPGQSLIDKLYNI 

WIRLQSHVmWDSEMDKLMMDKYPGIRQILEK 

KEGLFRKHMMGKRVDYAARSVICPDMYINTNEI 

GIPMVFATKLTYPQPVTPWNVQELRQAVINGPN 

VHPGASMVINEDGSRTALSAVDMTQREAVAKQ 

LLTPATGAPKPQGTKIVCRHVKNGDILLLNRQPT 

LHRPSIQAHRARILPEEKVLRLHYANCKAYNADF 

DGDEMNAHFPQSELGRAEAYVLACTDQQYLVP 

KDGQPLAGLIQDHMVSGASMTTRGCFFTREHYM 

ELVYRGLTDKVGRVKLLSPSILKPFPLWTGKQVV 

STLLmilPEDHIPLNLSGKAKITGKAWVKETPRSV 

PGFNPDSMCESQVIIREGELLCGVLDKAHYGSSA 

YGLVHCCYEIYGGETSGKVLTCLARLFTAYLQL 

YRGFTLGVEDILVKPKADVKRQRIffiESTHCGPQ 

AVRAALNLPEAASYDEVRGKWQDAHLGKDQRD 

FNMIDLKFKEEVNHYSNEINKACMPFGLHRQFPE 

NTLQLMVQSGAKGSTVNTMQISCLLGQIELEGRS 

TPLMASGKSLPCFEPYEFTPRAGGFVTGEUFLTGIK 

PPEFFFHCMAGREGLVDTAVKTSRSGYLQRCIIK 

HLEGLVVQYDLTVRDSDGSWQFLYGEDGLDIP 

KTQFLQPKQFPFLASNYEVIMKSQHLHEVLSRAD 

PKKALHHFRAIKKWQSKHPNTLLRRGAFLSYSQ 

KIQEAVKALKLESENRNGR/RPWDS/G/RMLRMW 

YELDEESRRKYQKKAAACPDPSLSVWRPDIYFAS 

VSETFETKVDDYSQEWAAQTEKSYEKSELSLDR 

LRTLLQL\KWQRSLCEPGEAVGLLAAQSIGEPST 

QMTLNTFHFAGRGEIVINVTLGIPRLREILMVASA 

NIKTPMMSVPVLNTKKALKRVKSLKKQLTRVCL 

GEVLQKIDVQESFCMEEKQNKFQVYQLRFQFLP 

HAYYQQEKCLRPEDILRFMETRFFKLLMESIKKK 

NNKASAFRNVNTRRATQRDLDNAGELGRSRGE 

QEGDEEEEGHIVDAEAEEGDADASDAKRKEKQE 

EEVDYESEEEEEREGEENDDEDMQEERNPHREG 

ARKTQEQDEEVGL/GH*GGPVPSRPPDAAPETHP 

QPGAPGA\EAMERRVQAVREIHPFIDDYQYDTEE 

SLWCQVTVKLPLMKINFDMSSLWSLAHGAVIY 

ATTCfrTTRrT T NFTTMNKMFKFT VLNTFOINT PELF 

KYAEVLDLRRL YSNDIHAIANTYGIEAALRVIEK 

EIKDVFAVYGIAVDPRHLSLVADYMCFEGVYKP 

LNRFGIRSNSSPLQQMTFETSFQFLKQATMLGSH 

DELRSPSACLWGKWRGGTGLFELKQPLR 


3390 


A 


2 


2080 


ILPPLEGPPAQASPSSTMLGEGSQPDWPGGSRYD 
LDEIDAYAVLELINSELICEMERPELDELTLERVLE 

ELETLCHQNMARAIETQEGLGIEYDEDWCDVC 

RSPEGEDGNEMVFGDKChA^CVHQACYGILKVPT 

GSWLCRTCALGVQPKCLLCPKRGGALKPTRSGT 

KWVHVSCALWIPEVSIGCPEKMEPITKISHIPASR 

WALSCSLCKECTGTCIQCSMPSCWTAFHVTCAF 

DHGLEmmADNDEVKFKSFCQEHSDGGPRNE 

PTSEPTEPSQAGEDLEKVTLRKQRLQQLEEDFYE 

LVEPAEVAERLDLAEALVDFIYQYWKLKRKANA 

NQPLLTPKTDEVDNLAQQEQDVLYRRLKLFTHL 

RQDLERVRNLCYMVTRRERTKHAICKLQEQIFH 

LQMKLIEQDLCRAGLSTSFPIDGTFFNSWLAQSV 

QITAENMAMSEWPLNNGHREDPAPGLLSEELLQ 

DEETLLSFMRDPSLRPGDPARKARGRTRLPAKK 

KPPPPPPQDGPGSRTTPDKAPKKTWGQDAGSGK 

nGnGPPTRKPPPRTRSHT PSSPAAGDrPTT ATPFS 

PPPLAPETPDEAASVAADSDVQVPVGPAASPKPLG 

RLRPPPREPR+mPlLPGC/ARPDAGDGDHLSAVA 

ERPKV\SLHFDTETDG\YFS\DGEMSNS\DV\EAED 

GGVQRGPREAGAKEWVRMGVLAS 


3391 


A 


1555 


327 


nsflhflhlkvrtmflfpsfpvlllsvvtascskt 

kacadtqktcsmitcgipvtngtpgrdgrdrpk 

gekgepglgqvsvas*1stsgrcssksvlepatrg 

lkhrlgeaplssgpmlhseqpl*naiasktklfv 

dslgshistqelgvcgcpfrgvsclvgelalvqa 

lh*vagesfffgsdhwligcaggeqewsiei:lgk 

kkrvtatgssslclatgqglrglqgppgkmgpp 

gntgtsgipgprgqkgdrgdnsvaeaklanler 

JVA-» >3JLiXVJ.l7«l 'i-JLX Jl JVXVL» X X^ 0XjVJJV\iV10vJr\J\JLvr V- X I 'iVJJL-f 

RMPFSKVKALCAGLQATVAAPKNAEENKAIQDV 
AKDTAFLGITDEATEGQFMYLTGGRLTYSNWKK 
DEPNDHGSGEDCVILLNNGLWNGISCTSSFIAICE 
FPA 


3392 


A 


218 


1773 


GGSRRNQRRSIPVLGYFLKQKKMTKAQESLTLE 

DVAVDFTWEEWQFLSPAQKDLYRDVMLENYSN 

LVSVGYQAGKPDALTKLEQGEPLWTLEDEIHSP 

AHPEIEKADDHLQQPLQNQKILKRTGQRYEHGR 

TLKSYLGLTNQSRRYNRKEPAEFNGDGAFLHDN 

HEQMPTEIEFPESRKPISTKSQFLKHQQTHNIEKA 

HECTDCGKAFLKKSQLTEHKRIHTGBCKPHVCSL 

CGKAFYKKYRLTEHERAHRGEKPHGCSLCGKAF 

YKRYRLTEHERAHKGEKPYGCSECGKAFPRKSE 

LTEHQRIHTGIKPHQCSECGRAFSRKSLLVVHQR 

THTGEKPHTCSECGKGFIQKGNLNfflQRTHTGEK 

PYGCIDCGKAFSOKSCLVAHORYHTGKTPFVCPE 

X 1 vJ V->XX-/ WVJXV«r\X »JV^XV0^X-» V xAXXN^XX l XXX \JXVXX X T WX i-i 

CGQPCSQKSGLIRHQKIHSGEKPYKCSDCGKAFL 
TKTMLIVHHRTHTGERPYGCDECEKAYFYMSCL 
VKHKRfflSREKRGD/CSEGGKSFHSKSQLKS**TC 
AGEKPC*YGNCGNGGRAV 


3393 


A 


46 


1464 


ARSLSGAPSGSSRQDGTSLLRTGAGYSSSQSIETL 

SLPPGPSHLVGDKSQGGRSCQGQITSAASGKTSK 

SEPNHVlFKKISRDKSVTVrYLGNRDYMDHVNSOV 

QPVDGWLVDPDLVKGKKVYVTLTCAFRYGQE 

DIDVIGLTFRRDLYFSRVQVYPPVGAASTPTKLQ 

ESLLKKLGSNTYPFLLTFPDYLPCSVMLQPAPQD 

SGKSCGVDFEVKAFATDSTDAEEDKIPKKSSVRL 

LIRKVQHAPLEMGPQPRAEAAWQFFMFODKPLH 

LAVSLNKRDLFPMGSPIPVPVSVPVNNTEKPVKKI 

KA\SVFOVANVVT YSWDYXYVKPVAMFFAOFKV 

PPNSTWTKA\LTLL\PWLV>JNRERRGIALDGKIKH 

EDTNLASSTIIKEGIDRKRSWEILVSYPDQR*SSTV 

SGFLGRASPSQ*SRrPRSQFRLVMHPQP\EDPA\K 

ESYQDANLVF\EEFARP*ILKDAGEA*\EGKRDQE 


3394 


A 


211 


1591 


RPPTMAADQRPKADTLALRQRLISSSCRLFFPEDP 

VKIVRAQGQYMYDEQGAEYTOCISNVAHVGHCH 

PLVVQAAHEQNQVLNTNSRYLHDNIVDYAQRLS 

ETLPEQLCVFYFLNSGSEANDLALRLARHYTGH 

QDWVLDHAYHGHLSSLIDISPYKFIWLDGQKE 

WVHVAPLPDTYRGPYREDHP\THVEDGLEKAFS* 

KRVVQGKNRQICRRQIAAFFAESLPSVGGQIIPPA 

GYFSQVAEHIRKAGGVFVADEIQVGFGRVGKHF 

WAFQLQGKDFVPDIVTMGKSIGNGHPVACVAAT 

VLEKEQLQDHATSVGSFLMQLLGQQKIKHPIVG 
DVRGVGLFIGVDLIKDEATRTPATEEAAYLVSRL 
KENYVLLSTDGPGRNILKFKPPMCFSLDNARQV 
VAKLDAILTDMEEKVRSCETLRLQP 


3395 


A 


1 


1424 


FRDGFSLRCGCNAELPGRGGDDAADRAIQRFLR 

TGAAVRYKVMKNWGVIGGIAAALAAGIYVIWG 

PITERKKKRKGLVPGLVNLGNTCFMNSLLQGLSA 

CPAFIRWLEEFTSQYSRDQKEPPSHQYLSLTLLHL 

LKALSCQEVTDDEVLHASCLLDVLRMYRWQISS 

FEEQDAHELFHVITSSLEDERDRQPRVTHLFDVH 

SLE\HSQK*LPKQITCRTRGSPHPTSNHWKSQHPF 

HGRLTSNMVCKHCEHQSPVRFDTFDSLSLSIPAA 

TWGHPLTLDHCLHHFISSESVRDWCDNCTKIEA 

KGTLNGEKVEHQRTTFVKQLKLGKLPQCLCIHL 

V^ivL/o w oonvj 1 i Jbisj^jTLCxi V v^r iNJ&r juivixvil/i i pw x ni^ 

lghkpsqhnpklnknpgptlelQdgpgaptpgl 
nqpgapktqifmngacspsllptlsapmpfplpv 
vpdyssstylfrlmgscrppwetwhsgtlcsftd 

GPHL 


3396 


A 


109 


107 


tqeagliffsppfslslslslplslfllshphsrtpp 

nrtprrtripqrpavmysplcltqdefhpfieall 

phvrafaytwfnlqarkrkyfkkhekrmskee 

eravkdellsekpevkqkwasrllaklrkdirp 

eyredfvltvtgkkppccvlsnpdqkgkmrrid 

clrqadkvwrldlvmvilfkgiplestdgerlv 

kspqcsnpglcvqphhigvsvkeldlylayfvh 

aadssqsespsqak*r*h*gparkwdiwgfq\ds 

fv'nsgvf\svt*a*lrvsqtpi\aag\tgpnfslsd 

lesssyysmspgamrrslpstsstsstkrlksved 

emdspgeepfytgqgrspgsgsqssgwhevepg 

mpspttlkkseksgfsspspsqtsslg\taftqhhr 

pairyhpqetlkefvqlvcpdagqqagqpngss 
qgkvhnpflptpmlpppppppmarpvplpvpdtk 
ppttsteggaasptspttrs/pgrtrpqqpfl/syg 
pp*psnaliggggggagerageradlem 


3397 


A 


1 


2002 


tgtltedgldvmgvvplkgqaflplvpeprrlp 
vgpllralatchalsrlqdtpvgdpmdlkmves 

TGWVLEEEPAADSAFGTQVLAVMRPPLWEPQLQ 

AMEEPPVPVSVLHRFPFSSALQRMSVVVAWPGA 

TQPEAYVKGSPELVAGLCNPETVPTDFAQMLQS 

YTAAGYRVVALASKPLPSVPSLEAAQQLTRDTV 

EGDLSLLGLLVMRNLLKPQTTPVIQALRRTRIRA 

VMVTGDNLQTAVTVARGCGMVAPQEHLIIVHA 

THPERGQPASLEFLPMESPTAVNGVKDPDQAAS 

YTVEPDPRSRHLALSGPTFGIIVKHFPKLLPKVLV 

QGTVFARMAPEQKTELVCELQKLQYCVGMCGD 

GANDCGALKAADVGISLSQAEASVVSPFTSSMA 

SIECVPMVIREGRCSLDTSFSVFKYMALYSLTQFI 

S VLILYTINTNLGDLQFLAIDL VimVAVLMSRT 

GPALVLGRVRPPGALLSVPVLSSLLLQMVLVTG 

VQLGGYFLTLAQPWFVPLNRTVAAPDNLPNYEN 

TVVFSLSSFQYLILAAAVSKGAPFR\RPLTNNVPF 

LLASAL*SSVLVVLVLSPGLLHGPLALRNITDTGF 

KLLLVGLVTLNFVGGLHAGERARPVPPRLPAPPP 

AQAGVSKKRFKQLERELAEQPWPPLPAGPLR 


3398 


A 


758 


1368 


FPFRMLTGYLYLMWRRKAFWSGTQRHPLPGGL 

KRRRRPGRGPWPAPGGQGVGPSAL*KAGSPPAN 

RPGQGE/PGLISPKPVTEVLPDVQGAPVPVPPLPT 

PPSLPHLQNQPP/TVQHYLLSFSWKPSQGPE*RA* 

PSPLPPAAMRPDG*PGPASQGPDQPG\PCPPASLP 

TSPPGKGFQKTETRKHPPPRQQHKPKCTANRPLA 

SFL 


3399 


A 


906 


1091 


HHHHHHHHHHHHHLVAFGKVQ*LQNSPSSSSSS 
SSGCFWQARFSSYRTLHHHHHHHHHHHHH 


3400 


A 


1838 


325 


PFLSVHRSPHGPSKLCDDPQASLVPEPVPGGCQE 

PEEMSWPPSGEIASPPELPSSPPPGLPEVAPDATST 

GLPDTPAAPETSTNYPVECTEGSAGPQSLPLPILE 

PVKNPCSVKDQTPLQLSVEDTTSPNTKPCPPTPTT 

PETSPPPPPPPPSSTPCSAHLTPSSLFPSSLESSSEQ 

KPYNFVILHARADEHIALRVSGRSWEALGVPDG 

ATFCEDFQVPGRGELSCLQDAIDHSAFIILLL-nSN 

\FDCR\LSLHQVNQAMMSNLmQGSQDCVlP\FLP 

\LESSPARLSSDTASLLSGLVRLDEHSQIFARKVA 

NTFKPHRLQARKAMWRKEQDTRALREQSQHLD 

GERMQAAALNAAYSAYLQSYLSYQAQMEQLQV 

AFGSHMSFGTGAPYGARMPFGGQVPLGAPPPFP 

TWPGCPQPPPLHAWQAGTPPPPSPQPAAFPQSLP 

FPAVPKPFPTASTAPPSEPKGWQP\LIIHHAQMVT 

SWG*NKH\MWNQRGSQAPEDKTQEAE 


3401 


A 


153 


1389 


EWGWLGAAQPPEEEAEAEDQESPSSLCREALAEI 

KKEISPLFIGMEKCSVGGLELTEQTPALLGNMAM 

ATSLMDIGDSFGHPACPLVSRSRNSPVEDDDDDD 

DVVFIESIQPPSISAPAIADQRNFIFASSKNEKPQG 

NYSVIPPSSRDLASQKGNISETIVIDDEEDIETNGG 

AEKKSSCFIEWGLPGTKNKTNDLDFSTSSLSRSK 

WAGMGNSGITTELTLKYIITNVTTLETGISSVNA 

GQDVNniTYKTSL*NTNLGDVAKGLQSSNFGVNI 

QTYTPSLTPQTKTGV\NLLTLVE*MWQETYFRME 

NLQLII/CPEDASTKKANVILPVESSKSFQEFYSTS 

CLSPCENNWNLKKGVFNKSRCTICSKLAEVWIFI 

PKLLFRLTVIILTFKCYYVLFHLHNARVLDV 


3402 


A 


153 


1389 


EWGWLGAAQPPEEEAEAEDQESPSSLCREALAEI 

KKEISPLFIGMEKCSVGGLELTEQTPALLGNMAM 

ATSLMDIGDSFGHPACPLVSRSRNSPVEDDDDDD 

DVVFIESIQPPSISAPAIADQRNFIFASSKNEKPQG 

NYSVIPPSSRDLASQKGNISETIVIDDEEDIETNGG 

AEKKSSCFIEWGLPGTKNKTNDLDFSTSSLSRSK 

WAGMGNSGITTELTLKYIITNVTTLETGISSVNA 

GQDWniTyKTSL*NTNLGDVAKGLQSSNFGVNI 

QTYTPSLTPQTKTGV\NLLTLVE*MWQETYFRME 

NLQLII/CPEDASTKKANVILPVESSKSFQEFYSTS 

CLSPCENNWNLKKGVFNKSRCTICSKLAEVWIFI 

PKLLFRLTVIILTFKCYYVLFHLHNARVLDV 


3403 


A 


609 


2765 


SRHCTPAERQNETHRAPDFAMSAVLGHQPPFFPA 

LTLPPNGAAALSLPGALAKPIMDQLVGAAETGIP 

FSSLGPQAHLRPLKTMEPEEEVEDDPKVHLEAKE 

LWDQFHKRGTEMVITKSGRRMFPPFKVRCSGLD 

KKAKYILLMDIIAADDCRYKFHNSRWMVAGKA 

DPEMPKRMYIHPDSPATGEQWMSKVVTFHKLKL 

TNNISDKHGFTILNSMHKYQPRFfflVRANDILKLP 

YSTFRTYLFPETEFIAVTAYQNDKITQLKIDNNPF 

AKGFRDTGNGRREKRXQLTLQSMRVFDERHKK 

ENGTSDESSSEQAAFNCFA\QASSPAA\PL*RTSNL 

KDRSPSRG*RATPEAEEQRGSTAPRPATRAKISP 

HPRRRSPAVTRAAPAVKAHLFAAERPRDSGRLD 

KASPDSRHSPATISSSTRGLGAEERRSPVREG\QA 

PAKVEEARALPGKEAFAPLTVQTDAAAAHLAQG 

PLPGLGFAPGLAGQQFFNGHPLFLHPSQFAMGG 

AFSSMAAAGMGPLLATVSGASTGVSGLDSTAM 

ASAAAAQGLSGASAATLPFHLQQHVLASQGLA 

MSPFGSLFPYPYTYMAAAAAA/SSAAASASVHRT 

P\FNLNTMRPRLRYSPYSIPVPVPDGSSLLTTALPS 

MAAAAGPLDGKAAALAASPASWAVDSGSELNS 

RSS\TLSSSSMSLSPKLCAEKEAATSELQSIQRLVS 

GLEAKPDRSRSASP 


3404 


A 


1082 


1308 


LKKFLEVPQSYSLLLSSPFLQ\WRA*RPQNAIG*Q 
FIIKTLVFFGIMRSAGDVLSTQVSCALRIMRTAGC 

SHSSP 


3405 


A 


1553 


559 


PRPPTQRLSRFAPPCRTAEFPFRRRAVVTRPAPPR 

ACTVVGRSSPVTGLAVGAAVAMLTVAARSRPFA 

PVLSATSRGVAGAL'nP*MQATVPATPEQPVLDL 

KRPFLSRESLSGQAVRRPLVASVGLNVPASVCYS 

HTDIKVPDFSEYRRLEVLDSTKSSRESSEARKGFS 

YLVTGVTTVGVAYAAKNAVTQFVSSMSASADV 

LALAKmiKLSDIPEGKNMAFKWRGKPLFVRHRT 

QKEIEQEAAVELSQLRDPQHDLDRVKKPEWVILI 

GVCTHLGCVPIANAGDFGGYYCPCHGSHYDASG 

RIRLGPAPLNLEVPTYEFTSDDMVIVG 


3406 


A 


83 


2671 


CLYPDFCRSVTCAMPCFTHRSCREDPGTSESREM 

DPVAFKDVAVNFTQEEWALLDISQKNLYREVML 

ETFWNLTSIGKKWKDQNIEYEYQNPRRNFRSVT 

EEKVNEIKEDSHCGETFTPVPDDRLNFQKKKASP 

EVKSCDSFVCEVGLGNSSSNMNIRGDTGHKACE 

CQEYGPKPWKSQQPKKAFRYHPSLRTQERDHTG 

KKPYACKECGKJsfllYHSSIQRHMVVHSGDGPYK 

CKFCGKAFHWLSLYLIHERTHTGEKPYECKQCG 

KSFSYSATHRIHERTHIGEKPYECQECGKAFHSPR 

SCHRHERSHMGEKAYQCKECGKAFMCPRYVRR 

HERTHSRKKLYECKQCGKALSSLTSFQTHIRMHS 

GERPYECKTCGKGFYSAKSFQRHEKTHSGEKPY 

KCKQCGKAFTRSGSFRYHERTHTGEKPYECKQC 

GKAFRSAPNLQSHGRTHTGEKPYECKECGKAFIF 

VNNLQSHERTQTHIRIHSGERRYKCKICGKGFYC 

PKSFQRHEKTHTGEKLYECyrATFSSSFSSSSSF*Y 

HERTHTGEKPYKCEQCGKAFRAVSIL*MHGRTH 

PEEKPYECEQ*RKAFRSAPHL*IRGRTHNGEiCPY 

ACKKCGKPFGSAQNLRIHERTQTHIMHSVERPYK 

CKICGRGFYSAKSFQTHEKSYTGEKPYECKQCG 

KAFVSFTSFRYHERTHTGENPYECKQFGKAFRSV 

KNLRFHKRTHTGEKPCEYMKRLTLEGNTMNAS 

>n/AKLSLLPVLFMMKEFILGRNPISVSNVRKPLF 

LPLLFNIMKGLTWERNPMSVCHVGKPSFLLVPFN 

IMKGLTLERSPMNISNVGKPSDQPRTFKCMEGLT 

LEKNPMNVSSMGKRSDLTRFFEYR 


3407 


A 


1426 


3 


PAAPSGASPGRVCGVETARPLGVQRRQSADEGP 

PGVAGLRHEPPTVWLGSVAHRGTWVCAHRWFG 

PAVTRAAQAATMVKLLVAKILCMVGVFFFMLL 

GSLLPVKIIETDFEKAHRSKKILSLCNTFGGGVFL 

ATC\LTALLARC*GKSSRRSWSLGH1STDYPL\AE 

TILLLGFFMTVFLEQLILTFAQENAVLHRPGDLQR 

RIGRGQRLGV*EPLHGGRAGPRAVRGAPRPRPQP 

ERAGPLA\PSPVRLLSLAFALSAHSVFEGLALGLQ 

EEGEKVVSLFVGVAVHETLVPVALGISMAGSAM 

PLRDAAKLAVTVSPMIPLGIGLGLGIEKAQGVPG 

DRLLKVLF\LVVGYTVLAGMGLPQVVSGLAIVPA 
AGSPPGAPGRTQAASPGRASPKSEHCGPGPPPVH 
KGPPGTRLCPRSYTLSLRALLLFKILLSLKSLYQK 
KK 


3408 


A 


106 


4514 


EARDRLAQSRAKEKELNSVASELSARQEESEHSH 

KHLIELRREFKKNVPEEIREMVAPVLKSFQAEVV 

ALSKRSQEAEAAFLSVYKQLIEAPALWELKLKSR 

PALGDSRVQQGQHDPKTDNQNTQQKAGFKEGW 

LAEASEREAFGPGFKDPVPVFEAARSLDDRLQPP 

SFDPSGQPRRDLHTSWKRNPELLSPKALKATQAE 

LLELRRKYDEEAASKADEVGLIMTNLEKANQRA 

EAAQREVESLREQLASVNSSIRLACCSPQGPSGD 

KVNFTLCSGPRLEAALASKDREILRLLKDVQHLQ 

SSLQELEEASANQIADLERQLTAKSEAIEKLEEKL 

QAQSDYEEIKTELSILKAMKLASSTCSLPQGMAK 

PEDSLLIAKEAFFPTQKFLLEKPSLLASPEEDPSED 

DSnCDSLGTEQSYPSPQQLPPPPGPEDPLSPSPGQP 

LLGPSLGPDGTRTFSLSPFPSLASGERLMMPPAAF 

KGEAGGLLVFPPAFYGAKPPTAPATPAPGPEPLG 

GPEPADGGGGGAAGPGAEEEQLDTAEIAFQVKE 

QLLKHNIGQRVFGHYVLGLSQGSVSEILARPKP\ 

WRKLHG* *GKEPFIKMKQFLSDEQNVLALRTIQV 

RQRGSITPRIRTPETGSDDAIKSILEQAKKEIESQK 

GGEPKTSVAPLSIANGTTPASTSEDAIKSILEQAR 

REMQAQQQALLEMEVAPRGRSVPPSPPERPSLAT 

ASQNGAPALVKQEEGSGGPAQAPLPVLSPAAFV 

QSIIRKVKSEIGDAGYFDHHWASDRGLLSRPYAS 

VSPSLSSSSSSGYSGQPNGRAWPRGDEAPVPPED 

EAAAGAEDEPPRTGELKAEGATAEAGARLPYYP 

AYVPRTLKPTVPPLTPEQYELYMYREVDTLELTR 

QVKEKLAKNGICQRIFGEKVLGLSQGSVSDMLSR 

PKPWSKLTQKGREPFIRMQLWLSDQLGQAVGQQ 

PGASQASPTEPRSSPSPPPSPTEPEKSSQEPLSLSLE 

SSKENQQPEGRSSSSLSGKMYSGSQAPGGIQEIV 

AMSPELDTYSITKRVKEVLTDNNLGQRLFGESIL 

GLTQGSVSDLLSRPKPWHKLSLKGREPFVRMQL 

WLNDPHNVEKLRDMKKLEKKAYLKRRYGLIST 

GSDSESPATRSECPSPCLQPQDLSLLQIKKPRVVL 

APEEKEALRKAYQLEPYPSQQTIELLSFQLNLKT 

NTVINWFHNYRSRMRREMLVEGTQDEPDLDPSG 

GPGILPPGHSHPDPTPQSPDSETEDQKPTVKELEL 

QEGPEENSTPLTTQDKAQVRIKQEQMEEDAEEE 

AGSQPQDSGELDKGQGPPKEEHPDPPGNDGLPK 

LSFKSASESSRCSLEVSLNSPSAASSPGLMMSVSP 
VPSSSAPISPSPPGAPPAKVPSASPTADMAGALHP 
SAKVNPNLQRRHEKMANLNNIIYRLERAANREE 
ALEWEF 


3409 


A 


162 


1710 


GPLSPGPYQCRPSLPAQLYPQSLMAAATLRTPTQ 

GTVTFEDVAVHFSWEEWGLLDEAQRCLYRDVM 

LENLALLTSLDVHHQKQHLGEKHFISNVGRALF 

VKTCTFHVSGEPSTCREVGKDFLAKLGFLHQQA 

AHTGEQSNSKSDGGAISHRGKTHYNWGEHTKAF 

SGKHTLVQQQRTLTTERCYICSECGKSFSKSYSL 

NDHWRLHTGEKPYECRECGKSFRQSSSLIQHRR 

GHTAVRPHECDECGKLFSNKSNLIKHRRVHTGE 

RPYECSECGKSFNQRSALLQHRGVHTGEKPYEC 

TECGKSFSHNSSLIKHQRIHSG*\RPYECTECGKSF 

SQNSSLIEHHRVHTGERPYKCSECGKSFRQRSAL 

T niTPrivPTriPppvPPQFr'r^i^'FFPVQ^QT nK"T4rvp\/ 
ijv^rixvvJ V Jr JL vjnivr i i_.v^dc/V^vjjsj^ r i i oODi-#oivriv^i\. v 

HTGSRPYECSECGKSFTQNSGLIKHRRVHTGEKP 

YECTE*KKSFSHNSSLIKHQRIHSR*KPYEVCKCG

N\R*HPGESP*VHSECQ/KSFS*RPYL1ECHTVHKG 

KTLLICRDVQLI 


3410 


A 


167 


789 


LCMKGISGGVRVAALAARAEREELPVPAMEPQP 
TAWGSPHPEAVLQLEVAPESSGPCTDTAKDQQS 

HGPSQQLPRCP*SWAWSEPWCQRPGCAV*APLP 
Y*REASFIYQSHSPAASGPFHSAGAGAVYLQAGG 
V/GEQEKEAVRKGSGSSSCSQRGP\PPPGMEVCPL 
LGFWAICP 




A 




oo / 


TQPPSVSKDLR\QTATLTCTGNSNNVGHQGVIWL 
QQHQGHPPKLLSYRNNNRPSGISERLSAYKSGNA 
ASLTIYGLQTEHEAD* *CRPRRKLIPKTARLFFFFL 
IDNEEYLLRVY 


3412 


A 


164 


83 


RRGIPGSASLSLTMCVRSCFQSPRLQWVWRTAFL 

KHTORRHnGSHRWTHLGGSTYRAVIFDMGGVLI 

PSPGRVAAEWEVQNRIPSGTILKALMEGGENGP 

WMRFMRAEITAEGFLREFGRLCSEMLKTSVPVD 

SFFSLLTSERVAKQFPVMTEAITQIRAKGLQTAVL 

SNNFYLPNQKSFLPLDRKQFDVIVESCMEGICKP 

DPRIYKLCLEQLGLQPSESIFLDDLGTNLKEAARL 

GmXIKVNDPETAVKELEALLGFTLRVGVPNTRP 

VKKTMEIPKDSLQKYLKDLLGIQTTGPLELLQFD 

HGQSNPITYIRLA>nUDLVLRKKPPGTLLPSAHAI 

EREFRIMKALANAGVPVPNVLDLCEDSSVIGTPF 

YVMEYCPGLIYKDPSLPGLEPSHRRAIYTAMNTV 

LCKIHSVDLQAVGLEDYGKQGSTTWV/YSSRRA 

RGALLFLDWELSYPWGDPFADVGYSCLAHYLPS 

SFPVLRGINDCDLTQLGIPAAEEYFRMYCLQMGL 

PPTENWNFYMAFSFFRVAAILQGVYKRSLTGQA 

MPFTNPLTRSYHTWARPQSQWCPTGSRSYSSVPE 
ASPAHTSRGGLVISPESLSPPVRELYHRLKHFME 
QRVYPAEPELQSHQASAARWSPSPLIEDLKVKQP 
W*GGRSGRTSWRLLALGCHT 


3413 


A 


105 


1573 


PESRHQCFSDRSSHFLTMEMEQEKMTMNKELSP 

DAAAYCCSACHGDETWSYNHPIRGRAKSRSLSA 

SPALGSTKEFRRTRSLHGPCPVTTFGPKACVLQN 

PQTIMHIQDPASQRLTWNKSPKSVLVIKKMRDAS 

LLQPFKELCTHLMEENMIVYVEKKVLEDPAIASD 

ESFGAVKKKFCTFREDYDDISNQIDFIICLGGDGT 

LLYASSLFQGSVPPVMAFHLGSLGFLTPFSFENFQ 

SQVTQVIEGNAAVVL/RGSRLKVRVVKELRGKK 

TAVHNGLGEKGSQAAGLDMDVGKQAMQYQVL 

NEVVIDRGPSSYLSNVDVYLDGHLITTVQGD/G* 

VJa v<a1J-»*5 VV \jr ArJ-iVJlvIl IxLiJvLiOIjOVJ V 1 V O 1 f 1 VJO J. 

AYAAAAGASMIHPNVPAIMITPICPHSLSFRPRW
PAGVELKIMLSPEARNTAWVSFDGRKRQEIRHG
DSISITTSCYPLPSICVRDPVSDWFESLAQCLHWN
VRKKQAHFEEEEEEEEEG


3414 


A 


20 


2602 


VIVNKNVNWINYIYYNQQQRAFHELKEKLMSAL 

ALGLPDLTKPFTFYESEREKMAVGVLTQTVGPW 

PRPVAYLSKQLDGVSKGWPPCLRALAATALLAQ 

EADKLTLGQNLNIKAPHAVVTLMNTKGHHWLT 

NARLTKYQSLPCENPHmEVCNTLNPTTLLPVSE 

SPGEHNCVEVLDSVYSSRPDLRDQPWASSVDWE 

LYMDGSSFINSQGERCAGYAWTLDAVIKAKLW 

LQGTSAQKAELIALTRAVELSEGQESLBELLGRY 

FYVSHLPAFAKAVAQLCITCRQHNARQSPTVSPH 

IQAYGAAPFEDLQVDFTEMPKCGGNKYLLVLTC 

TYSGWVEAYPTRTEKAYEVTRVLLRDLIPRFGLP 

LRIGSHNGPVFVADLDCVEINVDTGVIWATWIKN 

EKDPVQLQKGKSGPSCTKGQCNPLELVITNPLDP 

RWKKGERVTLGINGAGLNPRVNILVRGEVYKCS 

LEPVFQTFYDELNVPITEFPGKTRNLFLQLAEHV 

AQSLTVTSCYVCGGTVIADQWPWEARELVPTDP 

VPDEFPAQKNHPDNFWVLKASHRQYYIARVEKD 

FTLPVGRLHGG/RSNHTEKNPFSKFPKLQTV*AHP 

ESHRDWTAPTGLYWICGHRAYTKLPVASSCVIGTI 

KPSFFLLSIKTGELLGFPVYASR\KSIAIRN*NNDK 

WPPERnQYYGPAT*AQDGSWGYRIPIYMINRIIRL 

QAVLKIITATGRALTILAQQETQMRNAIYQNRLA 

LDYLLAAEGEVCRKFNLTNCCLHTONQGQVVED 

IVRDMTKVAHVPVQVWHGFDPGAMFRKWFPAL 

GGFKTLIIRVirVIGTYLLLPRLLPVLLQMIKSFIAT 

LVYQNASAQVYYINHY 


3415 


A 


455 


108 


NMSWRGRSTYRPRPRRSLQPPELIGAMLEPTDEE 
PKEEKPPTKSRNPTPDQKREDDSG/SAA*DFKWP 
EPGKPIFQGAMVRPKTGG/CGCEGGY*CQGEDS\P 
KAEHFKMPEAGEGKSQV 


3416 


A 


1 


874 


EFFFQRINFIEHSGSVSLLALACDLGWCEDWSCC 

LVQGGGDLVDWQTNHGEDEAGGDTDSVDEAR 

CKESQQEAQENLREDLCLESFAKDKILQIIEGSER 

EHEETRTKQAALDGEPLGGGQLTAVHLHPSKEQ 

QGQEGGERQRGARTHHWRGWEKGRRVRLRPPS 

GKLRADQPVRKLGGPTPS/TELPGLQPHAPTPHT 

A/PATPTYSPAPDTPNPPVRWKCPLPVEPRTRQLC 

RERTRKACPPKPRPPLGLPGDPTGPVTHHAPPVS 

PTGASGQERRAEPGAVSYAHASATK 


3417 


A 


243 


847 


CLKYMYTYIFCPNCVSYKMKTDHFSLRYLHSSC 

AEDNKSSVDSSGQAAHPSKGKFFPHGTHWGTQC 

RGHISVLGWQCSCPSTGCRVGLGLAMCQTHAYI 

HTHTHTHTHTPTDYGAHHTDPLQRWGLGPRVKS 

EAGPLPQLSRDQSHPGPLSPGASPRSAGLPGWHP 

AHQEPRARGRCARDGLSLQTRLTNKYDIQCCQE 

MRK 


3418 


A 


4073 


1000 


LDEYEARLTLANLDDFEEDNEDDDENRVNQEEK 

AAKITELINKLNFLDEAEKDLATVNSNPFDDPDA 

AELNPFGDPDSEEPITETASPRKTEDSFYNNSYNP

FKEVQTPQYLNPFDEPEAFVTIKDSPPQSTKRKNI 

RPVDMSKYLYADSSKTEEEELDESNPFYEPKSTP 

PPNNLVNPVQELETERRVKRKAPAPPVLSPKTGV 

LNENTVSAGKDLSTSPKPSPIPSPVLGRKPNASQS 

LLVWCKEVTKNYRGVKITNFTTSWRNGLSFCAI 

LHHFRPDLIDYKSLNPQDIKENNKKAYDGFASIGI 

SRLLEPSDMVLLAIPDKLTVMTYLYQIRAHFSGQ 

ELNVVQIEENSSKSTYKVGNYETDTNSSVDQEKF

YAELSDLKREPELQQPISGAVDFLSQDDSVFVND 

SGVGESESEHQTPDDHLSPSTASPYCRRTKSDTEP 

QKSQQSSGRTSGSDDPGICSNTDSTQAQVLLGKK 

RLLKAETLELSDLYVSDKKKDMSPPFICEETDEQ 

KLQTLDIGSNLEKEKLENSRSLECRSDPESPIKKT 

SLSPTSKLGYSYSRDLDLAKKKHASLRQTESDPD 

ADRTTLNHADHSSKIVQHRLLSRQEELKERARVL 

LEQARRDAALKAGNKHNTNTATPFCNRQLSDQ 

QDEERRRQLRERARQLIAEARSGVKMSELPSYGE 

MAAEKLKERSKASGDENDNIEIDTNEEIPEGFVV 

GGGDELTNLENDLDTPEQNSKLVDLKLKKLLEV 

QPQVANSPSSAAQKAVTESSEQDMKSGTEDLRT 

ERLQKTTERFRNPVVFSKDSTVRKTQLQSFSQYI

ENRPEMKRQRSIQEDTKKGNEEKAAITETQRKPS 

EDEVLNKGFKDS\SQYVVGELAALENEQKQIDTR 

AALVEKRLRYLMDTGRNTEEEEAMMQEWFML 

VNKKNALIRRMNQLSLLEKEHDLERRYELLNRE 

LRAMLAIEDWQKTEAQKRREQLLLDELVALVN 

KRDALVRDLDAQEKQAEEEDEHLERTLEQNKG 

KMAKKEEKCVLQ 


3419 


A 


4073 


1000 


LDEYEARLTLANLDDFEEDNEDDDENRVNQEEK 
AAKITELINKLNFLDEAEKDLATVNSNPFDDPDA 
AELNPFGDPDSEEPITETASPRKTEDSFYNNSYNP 






WO 01/57190

PCT/US01/04098




FKEVQTPQYLNPFDEPEAFVTIKDSPPQSTKRKNI 

RPVDMSKYLYADSSKTEEEELDESNPFYEPKSTP 

PPNNLVNPVQELETERRVKRKAPAPPVLSPKTGV 

LNENTVSAGKDLSTSPKPSPIPSPVLGRKPNASQS 

LLVWCKEVTKNYRGVKITNFTTSWRNGLSFCAI

LHHFRPDLIDYKSLNPQDIKENNKKAYDGFASIGI 

SRLLEPSDMVLLAIPDKLTVMTYLYQIRAHFSGQ 

ELNVVQIEENSSKSTYKVGNYETDTNSSVDQEKF

YAELSDLKREPELQQPISGAVDFLSQDDSVFVND 

SGVGESESEHQTPDDHLSPSTASPYCRRTKSDTEP 

QKSQQSSGRTSGSDDPGICSNTDSTQAQVLLGKK 

RLLKAETLELSDLYVSDKXKDMSPPFICEETDEQ 

KLQTLDIGSNLEKEKLENSRSLECRSDPESPIKKT 

SLSPTSKLGYSYSRDLDLAKKKHASLRQTESDPD 

ADRTTLNHADHSSKIVQHRLLSRQEELKERARVL 

LEQARRDAALKAGNKHNTNTATPFCNRQLSDQ 

QDEERRRQLRERARQLIAEARSGVKMSELPSYGE 

MAAEKLKERSKASGDENDNIEIDTNEEIPEGFVV

GGGDELTNLENDLDTPEQNSKLVDLKLKKLLEV 

QPQVANSPSSAAQKAVTESSEQDMKSGTEDLRT 

ERLQKTTERFRNPVVFSKDSTVRKTQLQSFSQYI 

ENRPEMKRQRSIQEDTKKGNEEKAAITETQRKPS 

EDEVLNKGFKDS\SQYVVGELAALENEQKQIDTR 

AALVEKRLRYLMDTGRNTEEEEAMMQEWFML

VNKKNALIRRMNQLSLLEKEHDLERRYELLNRE 

LRAMLAIEDWQKTEAQKRREQLLLDELVALVN 

KRDALVRDLDAQEKQAEEEDEHLERTLEQNKG 

KMAKKEEKCVLQ 


3420 


A 


612 


1058 


ENLGPNYSHRLLHHPTFYKKIHKKHHEWTAPIG 

VISLYAHPIEHAVSNMLPVIVGPLVMGSHLSSITM 

WFSLALIITTISHCGYHLPFLPSPEFHDYHHLKFN 

QCYGVLGVLDHLHGTDTMFKQTKAYERHVLLL 

GFTPLSESIPDSPK 


3421 


A 


23 


2005 


LLTPCDGRIPGRPSVGAESGSDFQQRRRRRRDPE 

EPEKTELSERELAVAVAVSQENDEENEERWVGP 

LPVEATLAKKRKVLEFERVYLDNLPSASMYERS 

YMHRDVITHWCTKTDFIITASHDGHVKFWKKIE 

EGIEFVKHFRSHLGVIESIAVSSEGALFCSVGDDK 

AMKVFDVVNFDMINMLKLGYFPGQCEWIYCPG 

DAISSVAASEKSTGKIFIYDGRGDNQPLHIFDKLH 

TSPLTQIRLNPVYKAVVSSDKSGMIEYWTGPPHE 

YKFPKNVNWEYKTDTDLYEFAKCKAYPTSVCFS 

PDGKKIATIGSDRKVRIFRFVTGKLMRVFDESLS 

MFTELQQMRQQLPDMEFGRRMAVERELEKVDA 

VRLmiVFDETGHFS^YGTMLGIKVINVETNRCV 

RILGKQENIRVMQLALFQGIAKKHRAATTIEMKA 

SENP\a.QNlQADPTIVCTSFKKNRFYMFTKR^ 

DTKSADSDRDVFNEKPSKEEVMAATQAEGPKRV 

SDSAIIHTSMGDIHTKLFPVECPKTVENFCVHSRN 

GYYNGHTFHRIIKGFMIQTGDPTGTGMGGESIWG 

GEFEDEFHSTLRHDRPYTLSMANAGSNTNGSQFF 

ITWPTPWLDNKHTVFGRVTKGMEVVQRISNWK 

VNPKTDKPYEDVSIINITVK 


3422 


A 


2486 


433 


FVLVCAPLTWAGARHRRMAASKKPPRVRVNHQ 
DFQLRNLRIIEPNEVTHSGDTGVETDGRMPPKVT 

SELLRQLRQAMRNSEYVTEPIQAYIIPSGDAHQSE 

YIAPCDCRRAFVSGFDGSAGTAUTEEHAAMWTD 

GRYFLQAAKQMDSNWTLMKMGLKDTPTQEDW 

LVSVLPEGSRVGVDPLIIPTDYWKKMAKVLRSA 

GHHLBPVKENLVDKIWTDRPERPCKPLLTLGLDY 

TGISWKI)KVADLRLKMAERNVMWFVVTALDEI 

AWLFNLRGSDVEHNPVFFSYAIIGLETIMLFIDGD 

RIDAPSVKEHLLLDLGLEAEYRIQVHPYKSILSEL 

KALCADLSPREKVWVSDKASYAVSETIPKDHRC 

CMPYTPICIAKA\\^SA\ESEGMRRAHIKDAVAL 

CELFNWLEKEVPKGGVTEISAADKAEEFRRQQA 

DFVDLSFPTISSTGPNGAIIHYAPVPETNRTLSLDE 

VYLIDSGAQYKDGTTDVTRTMHFGTPTAYEKEC 

FTYVLKGHIAVS AAVFPTGTKGHLLDSFARSAL 

w/n^HT nvT wrtTfiMrivrT^sFT MVHFnprnmYiCTF 

W J--'OvJJ-»J-' I JUnvj X vJJT.\J Y VJOr J-»l> V ITIXjVJX WVJJIO I rv 1 r 

SDEPLEAGMTVTDEPGYYEDGAFGIRIENVVLVV 
PVKTKYNFNNRGSLTFEPLTLVPIQTKMIDVDSL 
TDKECDWLNNYHLTCRDVIGKELQKQGRQEAL 
EWLIRETQPISKQH 


3423 


A 


5515 


934 


FKMPENPATDKLQVLQVLDRLKMKLQEKGDTS 

QNEKLSMFYETLKSPLFNQILTLQQSIKQLKGQL 

NHIPSDCSANFDFSRKGLLVFTDGSITNGNVHRPS 

NNfSTVSGLFPWTPKLGNEDFNSVIQQMAQGRQIE 

YIDIERPSTGGLGFSVVALRSQNLGKVDIFVKDV 

QPGSVADRDQRLKENDQILAINHTPLDQNISHQQ 

AIALLQQTTGSLRLIVAREPVHTKSSTSSSLNDTT 

LPETVCWGHVEEVELINDGSGLGFGIVGGKTSGV 

WRTIVPGGLADRDGRLQTGDHILKIGGTNVQG 

MTSEQVAQVLRNCGNSVRMLVARDPAGDISVTP 

PAPAALPVALPTVASKGPGSDSSLFETYNVELVR 

KDGQSLGIRIVGYVGTSHTGEASGIYVKSIIPGSA 

AYHNGHIQVNDKIVAVDGVNIQGFANHDVVEVL 

RNAGQVVHLTLVRRKTSSSTSPLEPPSDRGTVVE 

PLKPPALFLTGAVETETNVDGEDEEIKERIDTLKN 

DNIQALEKLEKVPDSPENELKSRWENLLGPDYEV 

MVATLDTQIADDAELQKYSKLLPIHTLRLGVEV 

DSFDGHHYISSIVSGGPVDTLGLLQPEDELLEVN 

GMQLYGKSRREAVSFLKEVPPPFTLVCCRRLFDD 

EASVDEPRRTETSLPETEVDHNMDVNTEEDDDG 

ELALWSPEVKIVELVKDCKGLGFSILDYQDPLDP 

TRSVIVIRSLVADGVAERSGGLLPGDRLVSVNEY 

CLDNTSLAEAVEILKAVPPGLVHLGICKPLVEDN 

EEESCYU.HSSSNEDKTEFSGTIHDINSSLILEAPK 

GFRDEPYFKEELVDEPFLDLGKSFHSQQKEIEQS 

KEAWEMHEFLTPRLQEMDEEREMLVDEEYELY 

QDPSPSMELYPLSHIQEATPVPSVNELHFGTQWL 

HDNEPSESQEARTGRTVYSQEAQPYGYCPENVM 

KENFVMESLPSVPSTEGNSQQGRFDDLENLNSLA 

KTSLDLGMIPNDVQGPSLLIDLPVVAQRREQEDL 

PLYQHQATRVISKASAYTGMLSSRYATDTCELPE 

REEGEGEETPNFSHWGPPRIVEIFREPNVSLGISIV 

GGQTVIKRLKNGEELKGIFIKQVLEDSPAGKTNA 

LKTGDKJDLEVSGVDLQNASHSEAVEAIKNAGNP 

VWIVQSLSSTPRVIPNVHNKANKITGNQNQDTQ 

EKKEKRQGTAPPPMKLPPPYKALTDDSDENEEE 

DAFTDQKIRQRYADLPGELHIIELEKDKNGLGLS 
LAGNKDRSRMSIFVVGINPEGPAAADGRMHIGD 
ELLEINNQILYGRSHQNXASAIIKTAPSKyKLVFIR 
NEDAVNQMAVTPFPVPSSSPSSIEDQSGTEPISSEE 

SFSSQEIPLAPASSYHSTDADFTGYGGFQAPLSVD 
PATCPIVPGQEMIIEISKRRSGLGLSIVGGKDTPLV 
NGVDLRNSSHEEAITALRQTPQKVRLWYRDEA 
HYRDEENLEEFPVDLQKKAGRGLGLSIVGKR 


3424 


A 


2223 


1162 


HASERVVQLPDFVWDQYTHSLGRVEREFKNRKR 

HTRRVKLVFDKGLPARPKSPLDPKKDGESLSYS 

MLPLSDGPEGSSSRPQMIRGRLCDDTKPETFNQL 

WTVEEQKKLEQLLIKYPPEEVESRRWQKIADELG 

NRTAKQVASRVQKYFIKLTKAGIPVPGRTPNLYI 

YSKKSSTSRRQHPLNKHLFKP\GTFMTSHEPPVY 

IVyinPr»T^T^WQr'FHQW\/n>ITAV13n A <;r>r>F<sTPTMYR'M 
ivij^jCj^i^i^ivo^rrioriiviiN ir\ v s^un^uucoiriLvi i ivin 

LPEYKELLQFKKLKKQKLQHMQAESGFVQHVGF 

KCDNCGIEPIQG\VRW\HCR\DCPP\EMSL\DFC\DS 

aSDCLHE-nDJHKGDHQLEPIYRSNETFLDRDYCV 

SQGTSYNYLDPNYFPANR 


3425 


A 


2223 


1162 


HASERVVQLPDFVWDQYTHSLGRVEREFKNRKR 

HTRRVKLVFDKGLPARPKSPLDPKKDGESLSYS 

MLPLSDGPEGSSSRPQMIRGRLCDDTKPETFNQL 

WTVEEQKKLEQLLIKYPPEEVESRRWQKIADELG 

NRTAKQVASRVQKYFIKLTKAGIPVPGRTPNLYI 

YSKKSSTSRRQHPLNKHLFKPNGTFMTSHEPPVY 

LPEYKELLQFKKLKKQKLQHMQAESGFVQHVGF 
KCDNCGIEPIQG\VRW\HCR\DCPP\EMSL\DFC\DS 
aSDCLHETODIHKGDHQLEPIYRSXETFLDRDYCV 
SQGTSYNYLDPNYFPANR 


3426 


A 


2 


1553 


LFVWHDDPRWGTPRYWLGALYRNQQSSPTAPP 

GLLPLEYFPAAPHCSHSRQWRCSQTHRIHHHPQ 

MLGPCRQEICGITMAAGTLYTYPENWRAFKALl 

AAQYSGAQVRVLSAPPHFHFGQTNRTPEFLRKFP 

AGKVPAFEGDDGFCVFESNAIAYYVSNEELRGST 

PEAAAQVVQWVSFADSDIVPPASTWVFPTLGIM 

HHNKQATENAKEEVRRILGLLDAYLKTRTFLVG 

ERVTLADITWCTLLWLYKQVLEPSFRQAFPNTN 

RWFLTCINQPQFRA\VFGEVKLCEKMAQF\DAKK 

FAETQPKKDTPRKEKGSREEKQKPQAERKEEKK 

AAAPAPEEEMDECEQALAAEPKAKDPFAHLPKS 

SLWYSEYRFPEELTQTFMSCNLITGMFQRLDKLR 
KNAFASVILFGTNNSSSISGVWVFRGQELAFPLSP 
DWQVDYESYTWRKLDPGSEETQTLVREYFSWE 
GAFQHVGKAFNQGKIFK 


3427 


A 


755 


52 


TAARRRQKGTAARRRQKGTAARRRQKGTAARR 

RQKGTAARRRQKGTAARRRQKGTAARRRQKGT 

AARRRQKGTAARRRQKGTAARRRQKGTAARRR 

OKGLSNLDAAEWLPPKKG\GEKKKGPFLAINEV 

VmEYPINILKRfflGVGFKKRAPRALKEIRKFAM 

KEMGTPDVRIDTRLNKAVWAKGBINVPYRIRVR 

LSRKRNEDEDSPNKLYTLVTYWVTTFKNLQTV 

NVDEN 


3428 


A 


4 


1939 


LPLSLSFSEMPLPLLPMDLKGEPGPPGKPGPWGP 

PGPPGFPGKPGHGKPGLHGQPGPAGPPGFSRMG 

KAGPPGLPGNVGPPGQPGLRGEPGIRGDQGLRGP 

PGPPGLPGPSGITIPGKPGAQGVPGPPGFQGEPGP 

QGEPGPPGDRGLKGDNGVGQPGLPGAPGQGGAP 

GPPGLPGPAGLGKPGLDGLPGAPGDKGESGPPG 

VPGPRGEPGAVGPKGPPGVDGVGVPGAAGLPGP 

QGPSGAKGEPGTRGPPGLIGPTGYGMPGLPGPKG 

DRGPAGVPGLLGDRGEPGEDGEPGEQGPQGLGG 

PPGLPGSAGLPGRRGPPGPKGEAGPGGPPGVPGI 

RGDQGPSGLAGKPGVPGERGLPGAHGPPGPTGP 

KGEPGFTGRPGGPGVAGALGQKGDLGLPGQPGL 

RGPSGIPGLQGPAGPIGPQGLPGLKGEPGLPGPPG 

EGRAGEPGTAGP\RGPPGVPGSPGITGPPG\LPGPP 

GAPGAFDETGIAGLHLPNGGVEGAVLGKGGKPQ 

LYNGHSGYNPATGIFTCPVGGVYYFAYHVHVKG 
TNVWVALYKNNVPATYTYDEYKKGYLDQASG 
GAVLQLRPNDQVWVQMPSDQANGLYSTEYIHSS 
FSGFLLCPT 


3429 


A 


212 


1075 


EGLTGPCERVPFLLGRGPPHGATRAGHRRAVRW 
AGPESLPPLPRSLIMDSPRAGTHQGPLDAETEVG 

ADRCTSTAYQEQRPQVEQVGKQAPLSPGLPAMG 
GPGPGPCEDPAGAGGAGAGGSEPLVTVTVQCAF 

TV/ AT P APPnATiT QQT P AT T HO AT PHnXAHT nOT ^ 

YLAPGEDGHWVPIPEEESLQRAWQDAAACPRGL 
QLQCRGAGiGRPVLYQWAQHSYSAQGPEDLGF 
RQGDTVD VLCEVDQA WLEGHCDGRIGIFPKCFV 
VPAGPRMSGAPGRLPRSQQGDQP 


3430 


A 


799 


1989 


rNKYINIRKKDCLLSPLPPLWSHLALLQASATKWV 

LTPAAFAGKLLSVFRQPLSSLWRSLVPLFCWLRA 

TFWLLATKRRKQQLVLRGPDETKEEEEDPPLPTT 

PTSVNYHFTRQCNYKCGFCFHTAKTSFVLPLEEA 

KRGLLLLK\EAG\LEKINFSGG\EPFLQDRGEYLGK 

LVRFCKVELRLPSVSIWSNGSLIRERWFQNYGVE 

YLDILAISCDSFDEEVNCP\IGRGN\GKKNHVENL 

KALNPVRWKVFQCLLBGENCGEDA\LREAERFV 
IGDEEFERFLERHKEVSCLVPESNQKMKDSYLIL 
DEYMRFLNCRKGRKDPSKSILDVGVEEAIKFSGF 
DEKMFLKRGGKYIWSKADLKLDW 


3431 


A 


5468 


2146 


ACGFLPGRCHFSTFKQCQEWLSRLSRATARPAKP 

EDLFAFAYHAWCLGLTEEDQHTHLCQPGEHIRC 

RQEAELARMGFDLQNVWRVSHINSNYKLCPSYP 

QKLLVPVWITDKELENVASFRSWKRDPVWYRH 

LRNGAAIARCSQPEISWWGWRNADDEYLVTSIA 

KACALDPGTRATGGSLSTGNNDTSEACDADFDS 

SLTACSGVESTAAPQKLLILDARSYTAAVANRAK 

GGGCECEEYYPNCEWFMGMANIHAIRNSFQYL 

RAVCSQMPDPSNWLSALESTKWLQHLSVMLKA 

AVLVANTVDREGRPVLVHCSDGWDRTPOIVALA 

KILLDPYYRTLEGFQVLVESDWLDFGHKFGDRC 

GHQENVEDQNEQCPVFLQWLDSVHQLLKQFPCL 

FEFNEAFLVKLVQHTYSCLYGTFLANNPC\EREK 

RNIYK/RGTCSVWALLRAGNKNFHNFLYTPSSD 

MVLHPVCHVRALHLWTAVYLPASSPCTLGEEN 
MDLYLSPVAQSQEFSGRSLDRLPKTRSMDDLLS 
ACDTSSPLTRTSSDPNLNNHCQEVRVGLEPWHS 
NPEGSETSFVDSGVGGPQQTVGEVGLPPPLPSSQ 
KDYLSNKPFKSHKSCSPSYKLLNTAVPREMKSNT 
SDPEIKVLEETKGPAPDPSAQDELGRTLDGIGEPP 
EHCPETEAVSALSKVISNKCDGVCNFPESSQNSPT 
GTPQQAQPDSMLGVPSKCVLDHSLSTVCNPPSA 
ACQTPLDPSTDRLNQDPSGSVASISHQEQLSSVP 
DLTHGEEDIGKRGNNRNGQLLENPRFGKMPLEL 
VRKPISQSQISEFSFLGSNWDSFQGMVTSFPSGEA 
TPRRLLSYGCCSKRPNSKQMRATGPCFGGQWAQ 
REGVKSPVCSSHSNGHCTGPGGKNQMWLSSHPK 
QVSSTKPVPLNCPSPVPPLYLDDDGLPFPTDVIQH
RLRQIEAGYKQEVEQLRRQVRELQMRLDIRHCC 
APPAEPPMDYEDDFTCLKESDGSDTEDFGSDHSE 
DCLSEASWEPVDKKETEVTRWVPDHMASHCYN 
CDCEFWLAKRRHHCRNCGNVFCAGCCHLKLPIP 
DQQLYDPVLVCNSCYEHIQVSRARELMSQQLKK 
PIATASS 


3432 


A 


36 


1873 


MTFFSSVADFIGLDPRIAAWLIDPSDATPSFEDLV 

EKYCEKSITVKVNSTYGNSSRNIVNQNVRENLKT 

LYRLTMDLCSKLKDYGLWQLFRTLELPLIPILAV 

MESHAIQVNKEEMEKTSALLGARLKELEQEAHF 

VAGERFLITSNNQLREILFGKLKLHLLSQRNSLPR 

TGLQKYPSTVSEALNALRDLHPLPKIILEYRQVH 

KIKSTFVDGLLACMKKGSJSSTWNQTGTVTGRLS 

AKIIPNIQGISKHPIQITTPKNFKGKEDKILTISPRA 

MFVSSKGHTFLAADFSQIELRILTHLSGDPELLKL 

FQESERDDVFSTLTSQWKDVPVEQVTHADREQT 

KKWYAWYGAGKERLAACLGVPIQEAAQFLES 

FLQKYKKIKDFARAAIAQCHQTGCWSIMGRRR 

PLPRIHAHDQQLRAQAERQAVNFWQGSAADLC 

KLAMIHVFTAVAASHTLTARLVAQIHDELLFEVE 

DPQIPECAALVRRTMESLEQVPLKVSLSAGRSWG 

HLVPLQEAWXALRQAHVALSLPATAWLPLGPLP 

APSPHPCIFRLHFVCSPRQQWEERTGFQQSIVWPS 

PRSPALYAPGRINPLGLGWPAIPWSKCLCKALKK 

K 


3433 


A 


1481 


476 


IPPKERAPGIRASCLAITAGARPTSYGRVGCEGDV 

RLSPVSPLL APPDPRLASRWEGRSRMKGKKGIVA 

ASGSETEDEDSMDIPLDLSSSAGSGKRRRRGNLP 

KESVQILRDWLYEHRYNAYPSEQEKALLSQQTH 

LSTLQVCNWFINARRRLLPDMLRKDGKDPNQFTI 

SRRGAKISETSSVESVMGIKNFMPALEETPFHSFT\ 

AGPNPTLG\RPLSAKP/SQSPGSVLARPSVICHTTV 

TAIERLSLSLSCQSVGCGQNT\DIQQIA1\RNLRDS 

SLMYPEDTCKSGPSTNTQSGLFNTPPPTPPDLNQ 

DFSGFQLLVDVALKRAAEMELQAKLTA 


3434 


A 


1720 


1243 


NGPVPPGGSKTKWAGGSAAEGSPRLSPSPGAAQ 

VPALLRGEPRGGAAAGSFWKPLHQHSCGLRPPP/ 

PPD/RLSRLPGKTLSACDRENGARRPLLLGSTSFIP 

IGRRTYASAAEPVGSKAVLVTGCDSGFGFSLAKH 

LHSKGFLVFAGCLMKDKGHDGVKELDSLNSDRL 

RTVQLNVCSSEEVEKVA^GDCPLEPEGPNEKGMW 

GLVNNAGISTFGEVEFTSLETYKQVAEVNLWGT 

VmTKSFLPLIRRAKGRVVNISSMLGRMANPAR 

Ox I v^i 1 jvr \j V jj«/\r oLjy^Ltis. I Hivi i rLt\j v jvy o v v jdjt vj 

NFIAATSLYSPESIQAIAKKMWEELPEWRKDYG 

KKYFDEKIAKMETYCSSGSTDTSPVIDAVTHALT 

ATTPYTRYHPMDYYWWLRMQIMTHLPGAISDM 

lYIR 


3435 


A 


842 


3595 


ENQQQMLVAKEQRLHFLKQQERRQQQSISENEK 

LQKLKERVEAQENKLBaCIRAMRGQVDYSKIMN 

GNLSAEIERFSAMFQEKKQEVQTAILRVDQLSQQ 

LEDLKKGKLNGFQSYNGKLTGPAAVELKRLYQE 

LQIRNQLNQEQNSKLQQQKELLNKRNMEVAMM 

DKRISELRERLYGKKIQACEKVFLNRVNGTSSPQ 

SPLSTSGRVAAVGPYIQVPSAGSFPVLGDPIKPQS 

LSIASNAAHGRSKSANDGNWPTLKQNSSSSVKP 

VQVAGADWKDPSVEGSVKQGTVSSQPVPFSALG 

PTEKPGIEIGKVPPPPGVGKQLPPSYGTYPSPTPL 

GPGSTSSLERRKEGSLPRPSAGLPSRQRPTLLPAT 

GSTPQPGSSQQIQQRISVPPSPTYPPAGPPAFPAGD 

SKPELPLTVAIRPFLADKGSRPQSPRKGPQTVNSS 

SIYSMYLQQATPPKNYQPAAHSALNKSVKAVYG 

KPVLPSGSTSPSPLPFLHGSLSTGTPQPQPPSESTE 

KEPEQDGPAAPADGSTVESLPRPLSPTKLTPIVHS 

PLRYQSDADLEALRRKLANAPRPLKKRSSITEPE 

GPGGPNIQKLLYQRFNTLAGGMEGTPFYQPSPSQ 

DFMVTLADVDNGNTNANGNLEELPPAQPTAPLP 

AEPAPSSDANDNELPSPEPEELICPQTTHQTAEPA 

EDNNNNVATVPTTEQIPSPVAEAPSPGEEQVPPA 

PLPPASHPPATSTNKRTNLKKPNSERTGHGLRVR 

FNPLALLLDASLEGEFDLVQRIIYEVEDPSKPNDE 

KJi 1 r JLrliN A V wXltlJtli V JSJr JL/JuJL>Jr U V IN V IN AAJJoJU 

GWTPLHCAASCNSVHLCKQLVESGAAIFASTISD 

lETAADKCEEMEEGYIQCSQFLYGVQEKLGVMN 
KGVAYALWDYEAQNSDELSFHEGDALTILRJIXD 
E 


3436 


A 


3 


2604 


GSTHASEKMKTGRSALVVTDTGDMSVLNSPRHQ 

SCIMHVDMDCFFVSVGIRNRPDLKGKPVAVTSN 

RGTGRAPLRPGANPQLEWQYYQNKILKGKADIP 

DSSLWENPDSAQANGIDSVLSRAEIASCSYEARQ 

LGIKNGMFFGHAKQLCPNLQAVPYDFHAYKEVA 

QTLYETLAS\YTHNIEAVSCDEALVDITEILAETK 

LTPDEFANAVRMEIKDQTKCAASVGIGSNILLAR 

MATRKAKPDGQYHLKPEEVDDFIRGQLVTNLPG 

VGHSMESKLASLGIKTCGDLQYMTMAKLQKEF 

GPKTGQMLYRFCRGLDDRPVRTEKERKSVSAEI 

NYGIRFTQPKEAEAFLLSLSEEIQRRLEATGMKG 

KRLTLKIMVRKPGAPVETAKFGGHGICDNIARTV 

TLDQATDNAKIIGKAMLNMFHTMKLNISDMRGV 

GIHVNQLVPTNLNPSTCPSRPSVQSSHFPSGSYSV 

RDVFQVQKAKKSTEEEHKEVFRAAVDLEISSASR 

TCTFLPPFPATTLPTSPDTNK A E55J5GK WNGLHTPV 

SVQSRLNLSIEVPSPSQLDQSVLEALPPDLREQVE 

QVCAVQQAESHGDKKKEPVNGCNTGILPQPVGT 

VLLQIPEPQESNSDAGINLIALPAFSQVDPEVFAA 

LPAELQRELKAAYDQRQRQGENSTHQQSASASV 




PH^LLHLKAAVKEKKIWKKK^ 

N^OCLLNSPAKTLPGACGSPQKLIDGFLKHEGPPA 

EKPLEELSASTSGVPGLSSLQSDPAGCVRPPAPNL 

AGAVEFNDVKTLLREWITTISDPMEEDILQVVKY 

CTDLffiEKDLEKLDLVIKYMKRLMQQSVESVWN 

MAFDFILDNVQWLQQTYGSTLKVT 


3437 


A 


32 


4038 


SLLRLLKAQWGSSGAASEPVVLGEEGCGFPSTNE 

YPDLEEERATYPQEEDRFLTPGI^QLLWSPWSPL 

DQEEACASRQLHSLASFSTVTARRNPLHNPWGM 

ELAASENTDSPSPRPLRPGVTLPPGALTMNTKDT 

TEVAENSHHLKIFLPKKLLECLPRCPLLPPERLRW 

NTNEEIASYLITFEKHDEWLSCAPKTRPQNGSIIL 

YNRKKVKYRKDGYLWKKRKIX3KTTREDHMKL 

KVQGMECLYGCYVHSSIVPTFHRRCYWLLQNPD 

IVLVHYLNVPALEDCGKGCSPIFCSISSDRREWLK 

WSREELLGQLBCPMFHGIKWSCGNGTEEFSVEHL 

VQQILDTHPTKPAPRTHACLCSGGLGSGSLTHKC 

SSTKHRIISPKVEPRALTLTSIPHPHPPEPPPLIAPLP 

PELPKAHTSPSSSSSSSSSGFAEPLEIRPSPPTSRGG 

SSRGGTAILLLTGLEQRAGGLTPTRHLAPQADPR 

PSMSLAWVGTEPSAPPAPPSPAFDPDRFLNSPQR 

GQTYGGGQGVSPDFPEAEAAHTPCSALEPAAAL 

EPQAAARGPPPQSVAGGRRGNCFFIQDDDSGEEL 

KGHGAAPPIPSPPPSPPPSPAPLEPSSRVGRGEALF 

GGPVGASELEPFSLSSFPDLMGELISDEAPSIPAPT 

PQLSPALSTITDFSPEWSYPEGGVKVLITGPWTEA 

AEHYSCVFDHIAVPASLVQPGVLRCYCPAHEVG 

LVSLQVAGREGPLSASVLFEYRARRFLSLPSTQL 

DWLSLDDNQFRMSELERLEQMEKRMAEIAAAGQ 

VPCQGPDAPPVQDEGQGPGFEARVVVLVESMIP 

RSTWKGPERLAHGSPFRGMSLLHLAAAQGYARL 

lETLSQWRSVETGSLDLEQEVDPLNVDHFSCTPL 

MWACALGHLEAAVLLFRWNRQALSIPDSLGRLP 

LSVAHSRGHVRLARCLEELQRQEPSVEPPFALSP 

PSSSPDTGLSSVSSPSELSDGTFSVTSAYSSAPDGS 

PPPAPLPASEMTMEDMAPGQLSSGVPEAPLLLM 

DYEATNSKGPLSSLPALPPASDDGAAPEDADSPQ 

AVDVIPVDMISLAKQIIEATPERIKREDFVGLPEA 

GASMRERTGAVGLSETMSWLASYL\ENVDHFPS 

STPPSEL\PFER\GRLGLSLTAPSWAEFLSCIPPVGK 

IGKLIFALLTL\SD\QEQRELYEAARVIQTAFRKYK 

GRRLKEQQEVAAAVIQRCYRKYKQLTWIALKFA 

LYKKMTQAAILIQSKFRSYYEQKRFQQSRRAAV 

LIQQHYRSYRRRPGPPHRTSATLPARNKGSFLTK 

KQDQAARKIMRFLRRCRHRMRELKQNQELEGLP 

QPGLAT 


3438 


A 


469 


2602 


FGRLLWGTAFKSWKMKAPBPHLILLYATFTQSLK 

VVTKRGSADGCTDWSroiKKYQVLVGEPVRIKC 

ALFYGYIRTNYSLAQSAGLSLMWYKSSGPGDFE 

EPIAFDGSRMSKEEDSIWFRPTLLQDSGLYACVIR 

NSTYCMKVSISLTVGENDTGLCYNSKMKYFEKA 

ELSKSKEISCRDEEDFLLPTREPEILWYKECRTKT 

WRPSIVFKRDTLLIREVREDDIGNYTCELKYGGF 

VVRRTTELTVTAPLTDKPPKLLYPMESKLTIQET 

QLGDSANLTCRAFFGYSGDVSPLIYWMKGEKFIE 

DLDENRVWESDIVKILKEHLGEQEVSISLIVDSVEE 

GDLGNYSCYVENGNGRRHASVLLHKRELMYTV 

ELAGGLGAILLLLVCLVTIYKCYKIEIMLFYRNHF 

GAEELDGDNKDYDAYLSYTKVDPDQWNQETGE 

EERFALEILPDMLEKHYGYKLFIPDRDLIPTGTYI 

EDVARCVDQSKRLIIVMTPNYVVRRGWSIFELET 

RLRl^TMLVTGEIKVILIECSELRGIMNYQEVEALK 

rl 1 JjULi/l VIK WHUJrIvCINJtLINoJsj* WIslOjVi iilJVIrr 

KRIEPITHEQALDVSEQGPFGELQTVSAISMAAAT 

STALATAHPDLRSTFHNTYHSQMRQBCHYYRSYE 

YDVPPTGTLPLTSIGNQHTYCNIPMTLINGQRPQT 

KSSREQNPDEAHTNSAILPLLPRETSISSVIW 


3439 


A 


251 


2037 


GPGNSSILIGGGHLFLIRSCLNLLLLNSKENTEHT 

MAKKVAVIGAGVSGLSSIKCCVDEDLEPTCFERS 

DDIGGLWKFTERGSSLSVMIWPLALSLLRHGGFC 

YSDFPFHEDYPNFMNHEKFWDYLQEFAEHFDLL 

KYIQFKTTVCGITiaRPDFSETGQWDWTETEGKQ 

NRAVFDAVMVCTGHFLNPHLPLEAFPGIHKFKG 

QILHSQEYKIPEGFQGKRVLVIGLGNTGGDIAVEL 

SRTAAQVLLSTRTGTWVLGRSSDWGYPYNMMV 

TRRCCSFIAQVLPSRFLNWIQERKLNKRFNHEDY 

GLSITKGKKAKFIVNDELPNCILCGAITMKtSVIE 

FTETSAVFEDGTVEENIDVVIFTTGYTFSFPFFEEP 

LKSLCTKKIFLYKQVFPLNLERATLAnGLIGLKGS 

ILSGTELQARWVTRVFKGLCKRPASQKLMMEAT 

EKcljLlKKu Vr JsJJ 1 oKiJKr D Y 1 A Y MUDlAAClu 1 

KPSIPLLFLKDPRLAWEVFFGPCTPYQYRNLMGPG 

KWDGARNAILTQWDRTLKPLKTRIVPDSSKAWP 

SM\SHYLKAWGAPVLLASLLLICK\SSLFLKLVRD 

KLQDRMSPYLVSLWRG 


3440 


A 


1 


3533 


IMPCGSSRLLRGCWTHPNEPVSDLSYFDCIESVM 

ENSKVLGESMAGISQNAKTGDLPAFGECVGIASK 

ALCGLTEAAAQAAYLVGIFDPNSQAGHQGLVDP 

IQFARANQAIQMACQNLVDPGSSPSQVLSAATIV 

AKHTSALCNACRIASSKTANPVAKRHFVQSAKE 

VANSTANLVKHKALDGDFSEDNKNKCRIATAPL 

lEAVENLTAFASNPEFVSIPAQISSEGSQAQEPELV 

SAKPMLESSSYLIRTARSLAINPKDPPTWSVLAG 

HSHTVSDSnCSLITSmDKAPGQRECDYSIDGINRC 

IRDIEQASLAAVSQSLATRDDISVEALQEQLTSVV 

QEIGHLIDPIATAARGEAAQLGHKGTQLASYFEP 

LILAAVGVASKILDHQQQMTVLDQTKTLAESAL 

QMLYAAKEGGGNPKAQHTHDAITEAAQLMKEA 

VDDIMVTLNEAASEVGLVGGMVDAIAEAMSKL 

DEGTPPEPKGTFVDYQTTVVKYSKAIAVTAQEM 

MTKSVTNPEELGGLASQMTSDYGHLAFQGQMA 

AATAEPEEIGFQIRTRVQDLGHGCIFLVQKAG\AL 

QVCPTDSYTKRELIECARAVTEKVSLVLSALQAG 

NKGTQACITAATAVSGIIADLDTTIMFATAGTLN 

AENSETFADHRENBLKTAKALVEDTKLLVSGAAS 

DPETQVVLINAIKDVAKALSDLISATKGAASKPV 
DDPSMYQLKGAAKVMVTNVTSLLKTVKAVEDE 
ATRGTRALEATIECDCQELTVFQSKDVPEKTSSPE 
ESIRMTKGITMATAKAVAAGNSCRQEDVIATAN 

LSRKAVSDMLTACKQASFHPDVSDEVRTRALRF 

GTECTLGYLDLLEHVLVILQKPTPELKQQLAAFS 

KRVAGAVTELIQAAEAMKGTEWVDPEDPTVIAE 

TELLGAAASffiAAAKKLEQLKPRAKPKQADETL 

DFEEQILEAAKSIAAATSALVKSASAAQRELVAQ 

GKVGSIPANAADDGQWSQGLISAARMVAAATSS 

LCEAANASVOGHASEEKLISSAKOVAASTAOLL 

VACKVKADQDSEAMRRLQAAGNAVKRASDNL 

VRAAQKAAFGKADDDDVVVKTKFVGGIAQIIAA 

QEEMLKKERELEEARKKLAQIRQQQYKFLPTEL 

REDEG 


3441 


A 


3 


1584


NSARGGVGVRGARAMATVQEKAAALNLSALHS 

PAHRPPGFSVAQKPFGATYVWSSUNTLQTQVEV 

KKRRHRLKRHNDCFVGSEAVDVIFSHLIQNKYF 

GDVDIPRAKVVRVCQALMDYKVFEAVPTKVFG 

KDKKPTFEDSSCSLYRFTTIPNQDSQLGKENKLY 

SPARYADALFKSSDIRSASLEDLWENLSLKPANS 

PHVNISTTLSPQVINEVWQEETIGRLLQLVDLPLL 

DSLLKQQEAVPKIPQPKRQSTMVNSSNYLDRGIL 

KAYSDSQEDEWLSAAIDCLEYLPDQMVVEISRSF 

PEQPDRTDLVKELLFDAIGRYYSSREPLLNHLSD 

VHNGIAELLVNGKTEIALEATQLLLKLLDFQNRE 

FFRRT I YFMAVAANPSFFKT OKFSDNTRMVVKRT 

FSKAIVDNKNLSKGKTDLLVLFL\MDHQKDVFKI 

PGTL\HKIVS\VK\L1V1AIQNGRDPNRDAGYIYCQR1 

DQRDYSNITEKTTIDELLYLLKTLDEDSKLSAKE 

I<KK\LLGQFYKCHPDIFIEHFGD 


3442 


A 


160 


822 


SPASGHCRLNGAAVAMFGCLVAGRLVQTAAQQ 
VAEDKFVFDLPDYESINHVWFMLGTIPFPEGMG 
GSVYFS YPDSNGMPVWOLLGFVTNGK PS A TFKT^ 

GLKSGEGSQHPFGAMNIVRTPSVAQIGISVELLDS 
MAQQTPVGNAAVSSVDSFTQFTQKMLDNFYNF 
ASSFAVSQA'PDDTQ/RPSEMFIPANVVLKWYENF 
QRRTSTEPSLLENUWIKINF 


3443 


A 


3 


1373 


SWHVRRRWLEATMAGGMKVAVSPAVGPGPWG 

SGVGGGGTVRLLLELSGCLVYGTAETDVNWML 

QESQVCEKRASQQFCYTNVLIPQWHDIWTRIQIR 

VNSSRLVRVTQVENEEKLKELEQFSIWNFFSSFL 

KEKLNDTY\fTvrVGLYSTKTCLKVEIIEKDTKYSVI 

VIRRFDPKLFLVFLLGLMLFFCGDLLSRSQIFYYS 

TGMTVGIVASL\LfflFILSKFMPKKSPrYVILVGGW 

SFSLYLIQLVFKNLQEIWRCYWQYLLSYVLTVGF 

MSFAVCYKYGPLENERSINLLTWTLQLMGLCFM 

YSGIOIPHIALAIIIIALCTKNLEHPIOWLYITCRKV 

CKGAEKPVPPRLLTEEEYRIQGEVETRKALEELR 

EFCNSPDCSAWKTVSRIQSPKRFADFVEGSSHLT 

PNEVSVHEQEYGLGSIIAQDEIYEEASSEEEDSYS 

RCPAITQNNFLT 


3444 


A 


566 


1718 


KGLERTCCAMEESDSEKTTEKENLGPRMDPPLG 

EPGVGSLGWVLPNTAMKKKVLLMGKSGSGKTS 

MRSIIFANYIARDTRRLGATILDRIHSLQINSSLST 

YSLVDSVGNTKTFDVEHSHVRFLGNLVLNLWDC 

GGQDTFMENYFTSQRDNIFRNVEVLIYVFDVESR 

ELEKDMHYYQSCLEAILQNSPDAKJFCLVHKMD 

LVQEDQRDLIFKEREEDLRRLSRPLECSCFRTSIW 

DETLYKAWSSIVYOLIPNVOOLEMNLRNFAEIIE 

ADEVLLFERATFLVISHYQCKEQRDAHRFEKISNl 

IKQFKLSCSKLAASFQSMEVRNSNFAAFIDIFTSN 

TYVMVVMSDPSIPSAATLINIRNARKHFHCLERV 

DGPKQCLLMR 


3445 


A 


566 


1718 


KGLERTCCAMEESDSEKTTEKENLGPRMDPPLG 
EPGVGSLGWVLPNTAMKKKVLLMGKSGSGKTS 
MRSDFANYIARDTRRLGATILDRIHSLQINSSLST 
YSLVDSVGNTKTFDVEHSHVRFLGNLVLNLWDC 

GGQDTFMENYFTSQRDNIFRNVEVLIYVFDVESR 

ELEKDMHYYQSCLEAILQNSPDAKIFCLVHKMD 

LVQEDQRDLIFKEREEDLRRLSRPLECSCFRTSIW 

DFTLYKAWJiSIVYOT TPNVOOLEMNLRNFAEUE 

ADEVLLFERATFLVISHYQCKEQRDAHRFEKISNI 

IKQFKLSCSKLAASFQSMEVRNSNFAAFIDIFrSN 

TYViVEVVMSDPSIPSAATLmilWARKHFEKLERV 

DGPKQCLLMR 


3446 


A 


566 


1718 


KGLERTCCAMEESDSEKTTEKENLGPRMDPPLG 

EPGNGSLGWVLPNTAMKKKVLLMGKSGSGKTS 

MRSHFANYL^RDTRRLGATILDRIHSLQINSSLST 

YSLVDSVGNTKTFDVEHSHVRFLGNLVLNLWDC 

GGQDTFMENYFTSQRDNIFRNVEVLIYVFDVESR 

ELEKDMHYYQSCLEAILQNSPDAKIFCLVHKMD 

LVQEDQRDLIFKEREEDLRRLSRPLECSCFRTSIW 

HFTT VRTAWR^TVYOl TPNVOOT FM>JT RNFAFTTF 

ADEVLLFERATFLVISHYQCKEQRDAHRFEKISNI 
IKQFKLSCSKLAASFQSMEVRNSNFAAFIDIFTSN 
TYVMVVMSDPSIPSAATLINIRNARKHFEKLERV 
DGPKQCLLMR 


3447 


A 


1 


2930 


VLLGPLWDKLSTADHPVIVTMASKRKSTTPCMIP 

VKTVVLQDASMEAQPAETLPEGPQQDLPPEASA 

ASSEAAQNPSSTDGSTLANGHRSTLDGYLYSCK 

YCDFRSHDMTQFVGHMNSEHTDFNKDPTFVCSG 

CSFLAKTPEGLSLHNATCHSGEASFVAVNVAKPD 

NHVVVEQSIPESTSTPDLAGEPSAEGADGQAEinX 

KTPIMKIMKGKAEAKKIHTLKENVPSQPVGEALP 

KLSTGEMEVREGDHSFINGAVPVRQASASSAKN 

PHAANGPLIGTVPVLPAGL\QFLSLQQQPPVHAQ 

HHVHQPLPTAKALPKVMIPLSSIPTYSAAMDSNS 

FLKNSFHKFPYPTKAELCYLTWTKYPEEQLKIW 

FTAQRLKQGISWSPEEIEDARKKMFNTVIQSVPQ 

PTITVLNTPLVASAGNVQHLIQAALPGHVVGQPE 

GTGGGLLVTQPLMANGLQATSSPLPLTVTSVPK 

QPGVAPINTVCSNTTSAVKVVNAAQSLLTACPSI 

TSQAFLDASIYKNKKSHEQLSALKGSFCRNQFPG 

QSEVEHLTKVTGLSTREVRKWFSDRRYHCRNLK 

GSRAMIPGDHRSIIIDSVPEVSFSPSSKVPEVTCIPT 

TATLATHPSAKRQSWHQTPDFTPTKYKERAPEQ 

LRALESSFAQNPLPLDEELDRLRSETKMTRREIDS 

WFSERRKKVNAEETKKAEENASQEEEEAAEDEG 

GEEDLASELRVSGENGSLEMPSSHILAERKVSPIK 

INLKNLRVTEANGRNEIPGLGACDPEDDESNKLA 

EQLPGKVSCKKTAQQRHLLRQLFVQTQWPSNQD 

YDSIMAQTGLPRPEVVRWFGDSRYALKNGQLK 

WYEDYKRGNFPPGLLVIAPGNRELLQDYYMTHK 

MLYEEDLQNLCDKTQMSSQQVKQWFAEKMGEE 

TRAVADTGSEDQGPGTGELTAVHKGMGDTYSE 

VSENSESWEPRVPEASSEPFDVTSSPQAGRQLETD 


3448 


A 


2 


1324


FVARAEKGFRTREAHLLQVAGVGTGLQNGASLS 

GLASGVMAQRAFPNPYADYNKSLAEGYFDAAG 

RLTPEFSQRLTNKIRELLQQMERGLKSADPRDGT 

GYTGWAGIAVLYLHLYDVFGDPAYLQLAHGYV 

KQSLNCLTKRSITFLCGDAGPLAVAAVLYHKMN 

NEKQAEDCITRLIHLNKIDPHAPNEMLYGRIGYIY 

ALLFVNKOTGVEKIPQSfflQQICETILTSGENLAR 

KRNFTAKSPLMYEWYQEYYVGAAHGLAGIYYY 

LMQPSLQVSQGKLHSLVKPSVDYVCQLKFPSGN 

YPPrTfiDNRnLLVHWrHGAPGVTYMT lOAYKVF 

R/EREKYLC\DA YQC ADVI WQ YGLLKKG YGLC Y\ 

GSAGNAYAFLTLYNLTQDMKYLYRACKFAEWC 

LEYGEHGCRTPDTPFSLFEGMAGTIYFLVADLLFP 

TKAR\FPAFEL 


3449 


A 


3 


2389 


SRHVTGAARSPSRAGPSDPPAMGDEDDDESCAV 

ELRITEANLTGHEEKVSVENFELLKVLGTGAYGK 

VFLVRKAGGHDAGKLYAMKVLRKAALVQRAK 

TQEHTRTERSVLELVRQAPFLVTLHYAFQTDAKL 

HLBLDYVSGGEMFTHLYQRQYFKEAEVRVYGGE 

IVLALEHLHKLGIIYRDLKLENVLLDSEGHIVLTD 

FGLSKEFLTBEKERTFSFCGTIEYMAPEIIRSKTGH 

GKAVDWWSLGILLFELLTGASPFTLEGERNTQAE 

VSRRILKCSPPFPPRIGPVAQDLLQRLLCKDPKKR 

LGAGPQGAQEVRNHPFFQGLDWVALAARKIPAP 

FRPQIRSELDVG\NFAEEFTRLEPVYSPPGQ\PPPG 

DPRIFQGYSFVAPSILFDHNNAVMTDGLEAPGAG 

DRPGRAAVARSAMMQDSPFFQQYELDLREPALG 

QGSFSVCRRCRQRQSGQEFAVKILSRRLEANTQR 

EVAALRLCQSHPNVVNLHEVHHDQLHTYLVLEL 

LRGGELLEHIRKKRHFSESEASQILRSLVSAVSFM 

HEEAGWHRDLKPENILYADDTPGAPVKUDFG/F 

SPRLRPQSPGVPMQTPSFTLQYAAPELLAQQGYD 

ESCDLWSLGVILY\MMLSGQAPFQGASGQGGQS 

OAAFTMCKTRFGRFST DGFAWOrrVSFFAKFT VR 

GLLTVDPAKRLKLEGLRGSSWLQDGSARSSPPLR 

TPDVLESSGPAVRSGLNATFMAFNRGKREGFFLK 

SVENAPLAKRRKQKLRSATASRRGSPAPANPGR 

APVASKGAPRRANGPLPPS 


3450 


A 


201 


1705 


KGTEMNKSRWQSRRRHGRRSHQQNPWFRLRDS 

EDRSDSRAAQPAHDSGHGDDESPSTSSGTAGTSS 

VPELPGFYFDPEKKRYFRLLPGHNNCNPLTKESIR 

QKEMESKRLRLLQEEDRRKKIARMGFNASSMLR 

KSQLGFLNVTNYCHLAHELRLSCMERKKVQIRS 

MDPSALASDRFNLILADTNSDRLFTVNDVTVGGS 

KYGIINLQSLKTPTLKVFMHENLYFTNRKVWSV 

CWASLNHLDSHELLCLMGLAETPGCATLLPASLF 

VNSHPAGIDRPG\MLCSFRIPGAWSCAWSLNIQA 

NNCFSTGLSRRVLLTlvrVVTGHRQSFGTNSDVLA 

QQFALMAPLLFNGCRSGEIFAIDLRCGNQGKGW 

KATRLFHDSAVTSVRILQDEQYLMASDMAGKIK 

LWDLRTTKCVRQYEGHVNEYAYLPLHVHEEEGI 

LVAVGQDCYTRIWSLHDARLLRTIPSPYPASKAD 

IPSVAFSSRLGGSRGAPGLLMAVGQDLYCYSYS 


3451 


A 


19 


6033 


LLSAMLSHGAGLALWITLSLLQTGLAEPERCNFT 

LAESKASSHSVSIQWRILGSPCNFSLIYSSDTLGA 

ALCPTFRIDNTTyGCNLQDLQAGTIYNFKIISLDE 

ERTWLQTDPLPPARPGVSKEKTTSTGLHVWWT 

PSSGKVTSYEVQLFDENNQKIQGVQIQESTSWNE 

YTFFNLTAGSKYNIAITAVSGGKRSFSVYTNGST 

VPSPVKDIGISTKANSLLISWSHGSGNVERYRLM 

LMDKGILVHGGWDKHATSYAFHGLSPGYLYNL 

TVMTEAAGLQNYRWKLVRTAPMEVSNLKVTND 

GSLTSLKVKWQRPPG\NVDSYNITLSHKGT1KESR 

VLAPWIT\ETHFKELVPGRLY\QVTCSAVSLGELS 

AQKM\AVGRTFPDKVANLEANNNGRMRSLWS 

WSPPAGDWEQYRILLFNDSWLLNTTVGKEETQ 

YVMDGTGLVPGRQYEVEVIVESGNLKNSERCQG 

RTVPLAVLQLRVKHANETSLSIMWQTPVAEWEK 

YIISLADRDLLLIHKSLSKDAKEFTFTDLVPGRKY 

MATVTSISGDLKNSSSVKGRTVPAQVTDLHVAN 

QGMTSSLFTNWTQAQGDVEFYQVLLIHENVVIK 

NESISSETSRYSFHSLKSGSLYSWVTTVSGGISSR 

QVVVEGRTVPSSVSGVTVNNSGRNDYLSVSWLL 

APGDVDNYEVTLSHDGKWQSLVIAKSVRECSF 

SSLTPGRLYTVTITTRSGKYENHSFSQERTVPDKV 

QGVSVSNSARSDYLRVSWVHATGDFDHYEVTIK 

NKNNFIQTKSIPKSENECVFVQLVPGRLYSVTVT 

TKSGQYEANEQGNGRHPEPVKDLTLRNRSTEDL 

HVTWSGANGDVDQYEIQLLFNDMKVFPPFHLVN 

TATEYRFTSLTPGRQYKILVLTISGDVQQSAFIEG 

FTVPSAVKNimSPNGATDSLTVNWTPGGGDVDS 

YTVSAFRHSQKVDSQTIPKHVFEHTFHRLEAGEQ 

YQIMIASVSGSLKNQINVVGRTVPASVQGVIADN 

AYSSYSLrVSWQKAAGVAERYDBLLLTENGILLR 

NTSEPATTKQHKFBDLTPGKKYKIQILTVSGGLFS 

KEAQTEGRTVPAAVTDLRITENSTRHLSFRWTAS 

EGELSWYNIFLYNPDGNLQERAQVDPLVQSFSFQ 

NLLQGRMYKMVIVTHSGELSNESFIFGRTVPASV 

SHLRGSNRNTTDSLWFNWSPASGDFDFYELILYN 

PNGTKKENWKDKDLTEWRFQGLVPGRKYVLW 

WTHSGDLSNKVTAESRTAPSPPSLMSFADIANT 

SLAITWKGPPDWTDYNDFELQWLPRDALTVFNP 

YNNRKSEGRTVYGLRPGRSYQFNVKTVSGDSWK 

TYSKPIFGSVRTKPDKIQNLHCRPQNSTAIACSWl 

PPDSDFDGYSIECRKMDTQEVEFSRKLEKEKSLL 

NIMMLVPHKRYLVSnCVQSAGMTSEVVEDSTIT 

MIDRPPPPPPHIRVNEKDVLISKSSINFTVNCSWFS 

DTNGAVKYFTWVREADGSDELKPEQQHPLPSY 

LEYRHNASIRVYQTNYFASKCAENPNSNSKSFNI 

KLGAEMESLGGKCDPTQQKFCDGPLKPHTAYRI 

SIRAFTQLFDEDLKEFTKPLYSDTFFSLPITTESEP 

LFGAIEGVSAGLFLIGMLVAWALLICRQKVSHG 

INQFEGHFMKLQADSNYLLSKEYEELKDVGRNQ 
SCDIALLPENRGKNRYNNBLPYDATRVKLSNVDD 
DPCSDYINASYPGNNFRREYIVTQGPLPGTKDDF 
WKMVWEQNVHNTVMVTQCVEKGRVKCDHYW 

PADQDSLYYGDLILQMLSESVLPEWnREFKICGE 

POT DAI-TRT TRHFHYTVWPDHGVPFTTOST TOFVR 

TVRDYINRSPGAGPTVVHCSAGVGRTGTFIALDR 

ILQQLDSKDSVDIYGAV\HDLRLHRVHMVQTEC 

QYVYLHQCVRDVLRARKLRSEQENPLFPIYENV 

NPEYHRDPVYSRH 


3452 


A 


63 


1073 


FFRSSSDNGSPIRQYE/HSTPAHQGPVMGLEGKS/ 

ARNSQLRIVLVGKTGAGKSATGNSILGRKVFHSG 

TAAKSITKKCEKRSSSWKETELVWDTPGIFDTE 

VPNAETSKEIIRCILLTSPGPHALLLWPLGRYTEE 

EHKATEKILKMFGERARSFMILIFTRKDDLGDTN 

EQEAQRAQLLGLIQRVVRENKEGCYTNRMYQR 
AEEEIQKQTQAMQELHRVELEREKARIREEYEEK 
IRKLEDKVBQEKRKKQMEKKLAEQEAHYAVRQ 
QRARTEVESKDGILELIMTALQIASFILLRLFAED 


3453 


A 


2674 


514 


GPITFLKKKAKMKDMPLRIHVLLGIJUTTLVQA 

DKKVDCPRLCTCEIRPWFTPRSIYMEASTVDCND 

LGLLTFPARLPANTQILLLQTNNIAKIEYSTDFPV 

NLTGLDLSQIWLSSVTNINGKKMPQLLSVYLEEN 

KLTELPEKCLSELSNLQELYINHNLLSTISPGAFIG 

LHNLLRLHLNSNRLQMINSKWFDALPNLEILMIG 

ENPIIRIKDMNFKPLINLRSLVIAGINLTEIPDNAL 

VGLENLESISFYDNRLIKVPHVALQKWNLKFLD 

LNKOTINRJRRGDFSNMLHLKELGINNMPELISID 

SLAVDNLPDLRKIEATNNPRLSYIHPNAFFRLPKL 

ESLMLNSNALSALYHGTIESLPNLKEISfflSNPIRC 

DCVIRWMNMNKTNIRFMEPDSLFCVDPPEFQGQ 

NVRQVHFRDMMEICLPLIAPESFPSNLNVEAGSY 

VSFHCRATA\EPQPEIYWITPSGQKLLPN'RLTDKF 

YVHSEGTLDINGVTPKEGGLYTCIATNLVGADLK 

SVMIKVDGSFPQDNNGSLNIKIRDIQANSVLVSW 

V A Q WTT T<r Q<sVl^WT A FVK'TF>J^T4 A A A T? TP^HV 
iS^r\ooIVi.J-/J\.oO V IV W 1 I\r V Iv 1 XjIN orxr\/\\^O^lVLr Ol-' V 

KVYNLTHLNPSTEYKICIDIPTIYQKNRKKCVNVT 
TKGLHPDQKEYEKNNTTTLMACLGGLLGnGVIG 

LISCLSPEMNCDGGHSYVRNYLQKPTFALGELYP 
PLINLWEAGKEKSTSLKVKATVIGLPTNMS 


3454 


A 


1844 


244 


ERYLFATYVAPSATLDIGLQQEKKKEIYMKIQPP 

FEDLFDTAEEYILLLLLEPWTKMVKSDQIAYKKV 

ELVEETRQLDSTYFRKLQALHKETFSKKAEDTTC 

EIGTGILSLSNVSKRTEYWDNVPAEYKHFKFSDL 

LNNKLEFEHFRQFLETHSSSMDLMCWTDIEQFRR 

ITYRDRNQRKAKSIYIKNKYLNKKYFFGPNSPAS 

LYQQNQVMHLSGGWGKILHEQLDAPVLVEIQK 

HVQNRLENVWLPLFLASEQFAARQKIKVQMKDI 

AEELLLQKAEKKIGVWKPVESKWISSSCKIIAFRK 

ALLNPVTSRQFQRFVALKGDLLENGLLFWQEVQ 

KYKDLCHSHCDESVIQKKITTIINCFINSSIPPALQI 

KFWPQFCEFRKNLTDENlMSVLERRQEYlsIKQKK 
KLAVL/ONDEKSGKDGIKOYANTSVPAIKTALLS 
DSFLGLQPYGRQPTWCYSKYIEALEQERILLKIQE 
ELEK\SCLQACNLSQILRLALQLCL 


3455 


A 


228 


3330 


APTAQAMMSFGGADALLGAPFAPLHGGGSLHY 
ALARKGGAGGTRSAAGSSSGFHSWTRTSVSSVS 

ASPSRFRGAGAASSTDSLDTLSNGPEGCMVAVA 

TSRSEKEQLQALNDRFAGYIDKVRQLEAHNRSLE 

GEAAALRQQQAGRSAMGELYEREVREMRGAVL 

RLGAARGQLRLEQEHLLEDIAHVRQRLDDEARQ 

REEAEAAARALARFAQEAEAARVDLQKKAQAL 

QEECGYLRRHHQEEVGELLGQIQGSGAAQAQM 

QAETRDALKCDVTSALREIRAQLEGHAVQSTLQ 

SEEWFRVRLDRLSEAAKVNTDAMRSAQEEITEY 

RRQLQARTIELEALKSTKDSLERQRSELEDRHQA 

DIASYQEAIQQLDAELRNTKWEMAAQLREYQDL 

LNVKMALDIEIAAYRKLLEGEECRIGFGPIPFSLP 

EGLPKIPSVSTHIKVKSEEKIKWEKSEKETVIVEE 

QTEETQVTEEVTEEEDKEAKEEEGKEEEGGEEEE 

AEGGEEETKSPPAEEAASPEKEAKSPVKEEAKSP 

AEAKSPEKEEAKSPAEVKSPEKAKSPAKEEAKSP 

PEVAKSPEKDGKQNFQAEVKSPEKAKSPAKEEAK 

SPAEAKSPEKAKSPVKEEAKSPAEAKSPVKEEAK 

SPAEVKSPEKAKSPTKEE\AKSPEKAKSPEKAKSP 

EKEEAKSPEKAKSPVKAEAKSPEKAKSPVKAEA 

KSPEKAKSPVKEEAKSPEKAKSPVKEEAKSPEKA 

KSPVKEEAKTPEKAKSPVKEEAKSPEKAKSPEKA 

KTLDVKSPEAKTPAKEEARSPADKFPEKAKSPVK 

EEVKSPEKAKSPLKEDAKAPEKEIPKKEEVKSPV 

KEEEKPQEVKVKEPPKKAEEEKAPATPKTEEKK 

EAKKEEAEDKKKVPTPEKEAPAKVEVKEDAKPK 
EKTEVAKKEPDDAKAKEPSKPAEKKEAAPEKKD 
TKEEKAKKPEEKPKTEAKAKEDDKTLSKEPSKP 
KAEKAEKSSSTDQKDSKPPEKATEDKAAKGK 


3456 


A 


258 


1463 


YLSFIPGHASKSAPMNGHCFAENGPSQKSSLPPLL 

IPPSENLGPHEEDQVVCGFKKLTVNGVCASTPPL 

TPIKNSPSLFPCAPLCERGSRPLPPLPISEALSLDDT 

DCEVEFLTSSDTDFLLEDSTLSDFKYDVPG\RRSF 

RGCGQINYAYFDTPAVSAADLSYVSDQNG\GVP 

DPNPPPPQTHRRLRRSHSGPAGSFNKPAIRISNCCI 

HRASPNSDEDKPEVPPRVPIPPRPVKPDYRRWSA 

P VT^ <5T V^DP HR PPK" VPPP PPT 9P SN SR TPSPK SLP 

SYLNGVMPPTQSFAPDPKYVSSKALQRQNSEGS 

ASKVPCILPIIENGKKVSSTHYYLLPERPPYLDKY 

EKFFREAKKKNGGAQIQPLPADCGISSATEKPDS 

KTKMDLGGHVKRKHLSYVGTP 


3457 


A 


2 


4869 


FILSSSSSASSEHFHHHYSFGNWWPGSFKGHRMS 

LPFYQRCHQHYDLSYRNKDVRSTVSHYQREKKR 

SAVYTQGSTAYSSRSSAAHRRESEAFRRASASSS 

QQQASQHALSSEVSRKAASAYDYGSSHGLTDSS 

LLLDDYSSKLSPKPKRAKHSLLSGEEKENLPSDY 

MVPIFSGRQKHVSGITDTEEERIKEAAAYIAQRNL 

LASEEGITTPKQSTASKQTTASKQSTASKQSTASK 

QSTASRQSTASRQSVVSKQATSALQQEETSEKKS 

RKVVIRGKAERLSLRKTLEETETYHAKLNEDHLL 

HAPEFIIKPRSHTVWEKENVKLHCSIAGWPEPRV 

TWYKNQVPINVHANPGKYIIESRYGMHTLEINAC 

DFEDTAQYRASAMNVKGELSAYASVWKRYKG 

EFDETRFHAGASTMPLSFGVTPYGYASRFEIHFD 

DKFDVSFGREGETMSLGCRVVITPEIKHFQPEIQ 

WYRNGVPLSPSKWVQTLWSGERATLTFSHLNKE 

DEGLYimVRMGEYYEQYSAYVFVRDADAEIEG 

APAAPLDVKCLEANKDYinSWKQPAVDGGSPIL 

GYFIDKCEVGTDSWSQCNDTPVKFARFPVTGLIE 

GRSYIFRVRAVNKMGIGFPSRVSEPVAALDPAEK 

ARLKS/PPLSTLDWAVIVTEEEPSEGIVPGPPTDLS 

VTEATRSYWLSWKPPGQRGHEGIMYFVEKCEA 

GTENWQRVNTELPVKSPRFALFDLAEGKSYCFR 

VRCSNSAGVGEPSEATEVTWGDKLDIPKAPGKI 

IPSRNTDTSVVVSWEESKDAKELVGYYIEANVA 

GSGKWEPCNNNPVKTHRFTCHGLVTGQSYIFRV 

RAVNAAGLSEYSQDSEAIEVKAAIAPPSPPCDITC 

LESFRDSMVLGWKQPDKIGGAEITGYYVNYREV 

IDGVPGKWREANVKAVSEEAYKISNLKENMVY 

QFQVAAMNMAGLGAPSAVSECFKCEEWTIAVP 

GPPHSLKCSEVRKDSLVLQWKPPVHSGRTPVTG 

YFVDLKEAKAKEDQWRGLNEAAIKNVYLKVRG 

LKEGVSYVFRVRAINQAGVGKPSDLAGPVVAET 

RPGTKEWVNVDDDGVISLNFECDKMTPKSEFS 

WSKDYVSTEDSPRLEVESKGNKTKMTFKDLGM 

DDLGIYSCDVTDTDGIASSYLIDEEELKRLLALSH 

EHKFPTVPVKSELAVEILEKGQVRF\WMQAEKLS 

GNAKVNYIFNEKGIFEGPKYKMHIDRNTGIIEMF 

MEKLQDEDEGTYTFQLQDGKATNHSTVVLVGD 

VFKKLQKEAEFQRQEWIRKQGPHFVEYLSWEVT 

GECNVLLKCKVANIKKETHIVWYKDEREISVDE 

KHDFKDGICTLLITEFSKKDAGIYEVILKDDRGK 

DKSRLKLVDEAFKELMMEVCKKIALSATDLKIQ 

STAEGIQLYSFVTYYVEDLKVNWSHNGSAIRYSD 

RVKTGVTGEQIWLQINEPTPNDKGKYVMELFDG 

K'Tnuri'K'TvnT QnnAvn'PAYAFFnPT t^oaataft^ 

NRARVLGGLPDVVTIQEGKALNLTCNVWGDPPP 
EVSWLKNEKALASDDHCNLKFEAGRTAYFTING 
VSTADSGKYGLWKNKYGSETSDFTVSVFIPEEE 

ARMAALESLKGGKKAK 


3458 


A 


3963 


827 


LSRSSSDNNTNTLGRNVMSTATSPLMGAQSFPNL 

TTPGTTSTVTMSTSSVTSSSNVATATTVLSVGQS 

LSNTLTTSLTSTSSESDTGQEAEYSLYDFLDSCRA 

STLLAELDDDEDLPEPDEEDDENEDDNQEDQEY 

EEVMBLRRPSLQRRAGSRSDVTHHAVTSQLPQVP 

AGAGSRPIGEQEEEEYETKGGRJRRTWDDDYVLK 

RQFSALVPAFDPRPGRTNVQQTTDLEIPPPGTPHS 

ELLEEVECTPSPRLALTLKVTGLGTTREVELPLTN 

FRSTIFYYVQKLLQLSCNGNVKSDKLRRIWEPTY 

TIMYREMKDSDKEKENGKMGCWSIEHVEQYLG 

TDELPKNDLITYLQKNADAAFLRHWKLTGTNKS 

IRKNRNCSQLIAAYWDLG\EHGTK\SGLNQGAIST 

LQSSDILNLTKEQPQAKAGNGQNSCGVEDVLQL 

LRILYIVASDPYSRISQEDGDEQPQFTFPPDEFTS/ 

KKITTKILQQIEEPLALASGALPDWCEQLTSKCPF 

LTPFETROLYFTCTAFGASRAIVWLONRREATVE 

RTRTTSSVRRDDPGEFRVGRLKHERVKVPRGESL 

MEWAENVMQIHADRKSVLEVEFLGEEGTGLGPT 

LEFYALVAAEFQRTDLGAWLCDDNFPDDESRHV 

DLGGGLKPPGYYVQRSCGLFTAPFPQDSDELERI 



WO 01/57190

PCT/US01/04098




TKLFHFLGIFIJVKCIQDNRLVDLPISKPFFKLMCM 

GDIKSNMSKLIYESRGDRDLHCTESQSEASTEEG 

HDSLSVGSFEEDSKSEFILDPPKPKPPAWFNGILT 

WEDFELVNPHRARFLKEIKDLAIKRRQILSNKGL 

SEDEKNTKLQELVLKNPSGSGPPLSIEDLGLNFQF 

CPSSRIYGFTAVDLKPSGEDEMITMDNAEEYVDL 

MFDFCMHTGiQKQMEAFRDGFNKVFPMEKLSSF 

SHEEVQMILCGNQSPSWAAEDIINYTEPKLGYTR 

DSPGFLRPVRVLCGMSSDERKAFLQFTTGCSTLP 

PGGLANLHPRLTVVRKVDATDASYPSVNTCVHY 

LKLPEYSSEEIMRERLLAATMEKGFHLN 


3459 


A 


88 


603 


SCGPRGLASLGLGFSGRCDDQNKGRS\DGPEAQA 

EACSGERTYQELLVNQNPIAQPLASRRLTRKLYK 

CIKKAVKQKQIRRGVKEVQKFVNKGEKGIMVLA 

GDTLPIEVYCHLPVMCEDRNLPYVYIPSKTDLGA 

AAGSKRPTCVIMVKPHEEYQEAYDECLEEVQSL 

PLPL 


3460 


A 


139 


1997 


QVTNMSDKSELKAELERKKQRLAQIREEKKRKE 

EERKKKETDQKKEAVAPVQEESDLEKKRREAEA 

LLQSMGLTPESPIVPPPMSPSSKSVSTPSEAGSQD 

SGDGAVGSRRGPIKLGMAKITQVDFPPREIVTYT 

KETQTPVMAQPKEDEEEDDDVVAPKPPIEPEEEK 

TLKKDEEN\DSKAPPHELTEEEKQQILHSEEFLSFF 

DHSTRIVERALSEQINIFFDYSGRDF/ENDKEGEIQ 

AGAKLSLNRQFF\DER\WSKASGWVSCLDWSSQ 

YP\ELLVASYNNNEDAPHEPDGVALYWNMKYK 

KTTPEYVFHCQSAVMSATFAKFHPNLWGGTYS 

GQIVLWDNRSNKRTPVQRTPLSAAAHTHPVYCV 

NVVGTQNAHNLISISTDGKICSWSLDMLSHPQDS 

MELVHKQSKAVAVTSMSFPVGDVNNFVVGSEE 

GSVYTACRHGSKAGISEMFEGHQGPITGIHCHAA 

VGAVDFSHLYVTSSFDWTVKLWTTKMlSfKPLYSF 

EDNAGYVYDVMWSPTHPALFACVDGMGRLDL 

WNLNNDTEVPTASISVEGNPALNRVRWTHSGRE 

lAVGDSEGQIVIYDVGEQIAVPKNDEWARFGRTL 

AEINANRADAEEEAATRIPA 


3461 


A 


139 


1997 


QVTNMSDKSELKAELERKKQRLAQIREEKKRKE 

EERKKKETDQKKEAVAPVQEESDLEKKRREAEA 

LLQSMGLTPESPIVPPPMSPSSKSVSTPSEAGSQD 

SGDGAVGSRRGPIKLGMAKTTQVDFPPREIVTYT 

KETQTPVMAQPKEDEEEDDDVVAPKPPIEPEEEK 

TLKKDEEN\DSKAPPHELTEEEKQQILHSEEFLSFF 

DHSTRIVERALSEQINIFFDYSGRDF/ENDKEGEIQ 

AGAKLSLNRQFF\DER\WSKASGWVSCLDWSSQ 

YP\ELLVASYNNNEDAPHEPDGVALVWNMKYK 

KTTPEYVFHCQSAVMSATFAKFHPNLVVGGTYS 

GQIVLWDNRSNKRTPVQRTPLSAAAHTHPVYCV 

NWGTQNAHNLISISTDGKICSWSLDMLSHPQDS 

MELVHKQSKAVAVTSMSFPVGDVNNFVVGSEE 

GSVYTACRHGSKAGISEMFEGHQGPITGIHCHAA 

VGAVDFSHLYVTSSFDWTVKLWTTKNNKPLYSF 

EDNAGYVYDVMWSPTHPALFACVDGMGRLDL 

WNLNNDTEVPTASISVEGNPALNRVRWTHSGRE 

lAVGDSEGQIVIYDVGEQIAVPKNDEWARFGRTL 

AEINANRADAEEEAATRIPA 


3462 


A 


2 


2643 


TAPEFSRSTHASAHASVARVLRNREIAQLKKEQR 
RQEFQIRALESQKRQQEMVLRRKTQEVSALRRL 
AKPMSERVAGRAGLKPPMLDSGAEVSASTTSSE 
AESGARSVSSIVRQWNRKINHFLGDHPAPTVNGT 
RPARKKFQKKGASQSFSKAARLKWQSLERRIIDI 
VMQRMTIVNLEADMERLIKKREELFLLQEALRR 
KRERLQAESPEEEKGLQELAEEffiVLAANIDYIND 
GITDCQATIVQLEETKEELDSTDTSWISSCSLAE 
ARLLLDNFLKASIDKGLQVAQKEAQIRLLEGRLR 
QTDMAGSSQNHLLLDALREKAEAHPELQALIYN 
VQQENGYASTDEEISEFSEGSFSQSFTMKGSTSH 
DDFKFKSEPKLSAQMKAVSAECLGPPLDISTKNI 
TKSLASLVEIKEDGVGFSVRDPYYRDRVSRTVSL 
PTRGSTFPRQSRATETSPLTRRKSYDRGQPIRSTD
VGFTPPSSPPTRPRNDRNVFSRLTSNQSQGSALD 
KSDDSDSSUSEVLRGIISPVGGAKGARTAPLQCV 
SMAEGHTKPILCLDATDELLFTGSKDRSCKMWN 
LVTGQEIAALKGHPNNWSIKYCSHSGLVFSVST 
SYIKVWDIRDSAKCIRTLTSSGQVISGDACAATST 
RAITSAQGEHQINQIALSPSGTMLYAASGNAVRI 
WELSRFQPVGKLTGHIGPVMCLTVTQTASQHDL 
VVTGSKJ3HYVKMFELGECVTGTIGPTHNFEPPH 
YDGIECLAIQGDILFSGSRDNGIKKWDLDQQELIQ 
QIPNAHKDWVCALAFIPGRPMLLSACRAGVIKV 
WNVDNFTPIGEIKGHDSPINAICTNAKHIFTASSG 
CRVKVWNYVPGLTPCLPRRVLAIKGRATTLP 


3463 


A 


198 


3146 


SGEPRPEPGNMATCIGEKIEDFKVGNLLGKGSFA 

GVYRAESIHTGLEVAIKMIDKKAMYKAGMVQR 

VQNEVKIHCQLKHPSILELYNYFEDSNYVYLVLE 

MCHNGEMNRYLKNRVKPFSENEARHFMHQIITG 

MLYLHSHGILHRDLTLSNLLLTRNMNIKIADFGL 

ATQLKMPHEKHYTLCGTPNYISPEIATRSAHGLE 

SDVWSLGCMFYTLLIGRPPFDTDTVKNTLNKVV 

LADYEMPTFLSIEAKDLfflQLLRRNPADRLSLSSV 

LDHPFMSRNSSTKSKDLGTVEDSIDSGHATISTAI 

TASSSTSISGSLFDKJm.LIGQPLPNKMTVFPKNK 

SSTDFSSSGDGNSFYTQWGNQETSNSGRGRVIQD 

AEERPHSRYLRRAYSSDRSGTSNSQSQAKTYTM 

ERCHSAEMLSVSKRSGGGENEERYSPTDNNANIF 

NFFKEKTSSSSGSFERPDNNQALSNHLCPGKTPFP 

FADPTPQTETVQQWFGNLQINAHLRKTTEYDSIS 

PNRDFQGHPDLQKDTSKNAWTDTKVKKNSDAS 

DNAHSVKQQNTMKYMTALHSKPEnQQECVFGS 

DPLSEQSKTRGMEPPWGYQNRTLRSITSPLVAHR 

LKPIRQKTKKAWSILDSEEVCVELVKEYASQEY 

VKEVLQISSDGNTITIYYPNGG\RGFPLA\DRPPSP 

'nDNISR\YSF\DNLPEKYWRKYQYASRFVQLVRS 

KSPKITYFTRYAKCILMENSPGADFEVWFYDGV 

KIHKTEDFIQVIEKTGKSYTLKSESEVNSLKEEIK 

MYMDHANEGHRICLALESnSEEERKTRSAPFFPri 

IGRKPGSTSSPKALSPPPSVDSNYPTRDRASFNRM 

VlVfflSAASPTQAPILNPSMVTOEGLGLTTTASGTD 

ISSNSLKDCLPKSAQLLKSVFVKNVGWATQ\LTS 

GAVWVQFNDGSQLVVQAGVSSISYTSPNGQ\TTR 

\YGENEKLPDY1KQKLQCLSSILLMFSNPTPNFH 


3464 


A 


14 


348 


AVRTVSGTSLGPRSHSRSPGRCHCFSAVTFSSPRL 

AASEAPDPMEEWDVPQMKKEVESLKYQLAFQR 

EMASKTIPELLKWffiDGIPKDPFLNPDLMKNNPW 

V\EKGKCTIL 


3465 


A 


5537 


405


VRKLDRERVGAWWRGAWARHPRQEAGEHAKR 

RKGHAETPRGRRKGRAGRSAAAVGELRPARRSL 

ETSRAAAAMAKDSPSPLGASPKKPGCSSPAAAV 

LENQRRELEKLRAELEAERAGWRAERRRFAARE 

RQLREEAERERRQLADRLRSKWEAQRSRELRQL 

QEEMQREREAEIRQLLRWKEAEQRQLQQLLHRE 

RDGVVRQARELQRQLAEELVNRGHCSRPGASEV 

SAAQCRCRLQEVLAQLRWQTDGEQAARIRYLQ 

AALEVERQLFLKYILAHFRGHPALSGSPDPQAVH 

SLEEPLPQTSSGSCHAPKPACQLGSLDSLSAEVG 

VRSRSLGLVSSACSSSPDGLLSTHASSLDCFAPAC 

SRSLDSTRSLPKASKSEERPSSPDTSTPGSRRLSPP 

PSPLPPPPPPSAHRKLSNPRGGEGSESQPCEVLTPS 

PPGLGHHELIKLNWLLAKALWVLARRCYTLQEE 

NKQLRRAGCPYQADEKVKRLKVKRAELTGLAR 

RLADRARELQETNLRAVSAPIPGESCAGLELCQV 

FARQRARDLSEQASAPLAKDKQIEELRQECHLLQ 

ARVASGPCSDLHTGRGGPCTQWLNVRDLDRLQ 

RESQREVLRLQRQLMLQQGNGGAWPEAGGQSA 

TCEEVRRQMLALERELDQRPOIECQELGAQAAPA 

RRRGEEAETQLQAALLKNAWLAEENGRLQAKT 

DWVRKVEAENSEVRGHLGRACQERDASGLIAEQ 

LLQQAARGQDRQQQLQRDPQKALCDLHPSWKEI 

QALQCRPGHPPEQPWETSQMPESQVKGSRRPKF 

HARAEDYAVSQPNRDIQEKREASLEESPVALGES 

ASVPQVSETVPASQPLSKKTSSQSNSSSEGSMWA 

TVPSSPTLDRDTASEVDDLEPDSVSLALEMGGSA 

APAAPKLKIFMAQYNYNPFEGPNDHPEGELPLTA 

GDYIYIFGDMDEDGFYEGELEDGRJR.GLyPSNFVE 

QIPDSYIPGCLPAKSPDLGPSQLPAGQDEALEEDS 

LLSGKAQGVVDRGLCQMVRVGSKTEVATEILDT 

KTEACQLGLLQSMGKQGLSRPLLGTKGVLRMAP 

MQLHLQNVTATSANITWVYSSHRHPHVVYLDD 

REHALTPAGVSCYTFQGLCPGTHYRARVEVRLP 

RDLLQVYWGTMSSTVTFDTLLAGPPYPPLDVLV 

ERHASPGVLWSWLPVTIDSAGSSNGVQVTGYA 

VYADGLKVCEVADATAGSTLLEFSQLQVPLTWQ 

KVSVRTMSLCGESLDSVPAQIPEDFFMCHRWPET 

PPFSYTCGDPSTYRVTFPVCPQKLSLAPPSAKASP 

HNPGSCGEPQAKFLEAFFEEPPRRQSPVSNLGSE 

GECPSSGAGSQAQELAEAWEGCRKDLLFQKSPQ 

NHRPPSVSDQTGEKENCYQHMGTSKSPAPGFIHL 

RTECGPRKEPCQEKAALERVLRQKQDAQGFTPP 

QLGASQQYASDFHNVLKEEQEALCLDLWGTERR 

EERREPEPHSRQGQALGVKRGCQLHEPSSALCPA 

PSAKVIKMPRGGPQQLGTGANTPARVFVALSDY 

NPLVMSANLKAAEEELVFQKRQLLRVWGSQDT 

HDFYLSECNRQVGNIPGRLVAEMEVGTEQTDRR 

WRSPAQGHLPSVAHLEDFQGLTIPQGSSLVLQGN 

SKRLPLWTPKIMIAALDYDPGDGQMGGQGKGRL 

ALRAGDWMVY\GPMDDQGFYYGELGGHRG\L 




VPANLRIKMSSQGH 


3466 


A 


1 


1111 


MSKPPDLLLRLLRGAPRQRVCTLFIIGFKFTFFVSI 

MIYWHYVGEPKEKGQLYNLPAEIPCPTLTPPTPP 

SHGPTPGNBFFLETSDRTNPNFLFMCSVESAARTH 

PESHVLVLMKGLPGGNASLPRHLGISLLSCFPNV 

QMLPLDLRELFRDTPLADWYAAVQGRWEPYLL 

PVLSDASRIM^MWia^GGIYLDTDFIVLKNLRNLT 

>m.GTQSRYVLNGAFLAFERRHEFMALCMRDFV 

DHYNGWIWGHQGPQLLTRVFKKWCSIRSLAESR 

ACRGVTTLPPEAFYPIPWQDWKKYFEDINPEELP 

RLLSATYAVHVWNKKSQGTRFEATSRALLAQLH 

ARYCPTTHE/DHENVLVKGPAGHLPNLLLMGHW 


3467 


A 


1


2175 


MAKVILKQSKQCKNLLTCKVAQVCPVCGCLHC 

YFWWLSGLESRRPSSPLIDIKPIEFGVLSAKKEPIQ 

PSVLRRTYNPDDYFRKFEPHLYSLDSNSDDVDSL 

TDEEILSKYQLGMLHFSTQYDLLHNHLTVRVIEA

RDLPPPISHDGSRQDMAHSNPYVKICLLPDQKNS 

KQTGVKRKTQKPVFEERYTFEIPFLEAQRRTLLL 

TWDFDKFSRHCVIGKVSVPLCEVDLVKGGHW 

WKAHDSQFSAPGLPADQQFFADLFSGLVLNPQL 

LGRVWFASQPASLPVGSLCIDFPRLDIVLRGEYG 

NLLEAKQQRLVEGEMLFIPARAANLPVNNKPVM 

LLSLVFAPTWLGLSFYDSRTTSLLHPARQIQLPVSL 

QRGEGEAMLSXALTLFSRSPLEQNIIQPLVLSLLHL 

CGSVVNMPPGNSQPRGDFLYHSICTWVQDNYAQ 

PLTRESVAQFFNITPNHLSKLFAQHGTMRFffiYVR 

WVRMAKARMILQKYHLSIHEVAQRCGFPDSDYF 

CRVFRRQFGMDYVDILQIHRWDYNTPIEETLEAL 

NDWKAGKARYIGASSMHASQFAQALELQKQH 

GWAQFVSMQDHYNLIYREEEREMLPLCYQEGV 

AVIPWSPLARGRLTRPWGETTARLVSDEVGKNL 

YKESDENDAQIAERLTGVSEELGATRAQVALAW 

LLSKPGIAAPIIGTSREEQLDELLNAVDITLKPEQI 

AELETPYKPHPVVGFK 


3468 


A 


147 


3209 


ALPLPLPTLYPGMSRRKQRKPQQLISDCEGPSASE 

NGDASEEDHPQVCAKCCAQFTDPTEFLAHQNAC 

STDPPVMVIIGGQENPNNSSASSEPRPEGHNNPQ 

VMDTEHSNPPDSGSSVPTDPTWGPERRGEESSGH 

FLVAATGTAAGGGGGLILASPKLGATPLPPESTP 

APPPPPPPPPPPGVGSGHLNIPLILEELRVLQQRQI 

HQMQMTEQICRQVLLLGSLGQTVGAPASPSELP 

GTGTASSTKPLLPLFSPIKPVQTSKTLASSSSSSSS 

SSGAETPKQAFFHLYHPLGSQHPFSAGGVGRSHK

PTPAPSPALPGSTDQLIASPHLAFPSTTGLLAAQC 

LGAARGLEATASPGLLKPKNGSGELSYGEVMGP 

LEKPGGRHKCRFCAKVFGSDSALQIHLRSHTGER 

PYKCNVCGNRFTTRGNLKVHFHRHREKYPHVQ 

MNPHPVPEHLDYVITSSGLPYGMSVPPEKAEEEA 

ATPGGGVERKPLVASTTALSATESLTLLSTSAGT 

ATAPGLPAFNKFVLMKAVEPKNKADENTPPGSE 

GSAISGVAESSTATRMQLSKLVTSLPSWALLTNH 

FKSTGSFPLPLCARALGVASPSETSKLQQLVEKID 

RQGAVAVTSAASGAPTTSAPAPSSSASSGPNQCV 

ICLRVLSCPRALRLHYGQHGGERPFKCKVCGRAF 

STRGNLRAHFVGHKASPAARAQNSCPICQKKFT 

NAVTLQQHVRMHLGGQBPNGGTALPEGGGAAQ 

ENGSEQSTVSGAGSFPQQQSQQPSPEEELSEEEEE 

EDEEEEEDVTDEDSLAGRGSESGGEKAISVRGDS 

EEASGAEEEVGTVAAAATAGKEMDSNEKTTQQS 

SLPPPPPPDSLDQPQPMEQGSSGVLGGKEEGGKP 

ERSSSPASALTPEGEATSVTLVEELSLQEAMRKEP 

GESSSRKACEVCGQAFPSQAAL\EEH\QKTHPKEG 

PLF\TCVFCRQGFLERATLKKHMLLAHHQVQPFA 

PHGPQNIAALSLVPGCSPSITSTGLSPFPRKDDPTI 

P 


3469 


A 

1 


3 


5664 


NLRPLSFALFLGDPNMANLEESFPRGGTRKIHKP 

EKAFQQSVEQDNLFDISTEEGSTKRKKSQKGPAK 

TKKLKIEKRESSKSAREKFEILSVESLCEGMRILG 

CVKEVNELELVISLPNGLQGFVQVTEICDAYTKK 

LNEQVTQEQPLKDLLHLPELFSPGMLVRCVVSSL 

GITDRGKKSVKLSLNPKNVNRVLSAEALKPGML 

LTGTVSSLEDHGYLVDIGVDGTRAFLPLLKAQEY 

IRQKNKGAKLKVGQYLNCIVEKVKGNGGWSLS 

VGHSEVSTAIATEQQSWNLNNLLPGLVVKAQVQ 

KVTPFGLTLNFLTFFTGWDFMHLDPKKAGTYFS 

NQAVRACILCVHPRTRVVHLSLRPIFLQPGRPLTR 

LSCQNLGAVLDDVPVQGFFKKAGATFRLKDGVL 

AYARLSHLSDSKNVFNPEAFKPGNTHKCRIIDYS 

QMDELALLSLRTSIIEAQYLRYHDIEPGAVVKGT 

X^TIKSYGMLVKVGEQMRGLVPPMHLADILMK 

NPEKKYfflGDEVKCRVLLCDPEAKKLMMTLKKT 

LIESKLPVITCYADAKPGLQTHGFIIRVKDYGCIV 

KFYNNVQGLVPKHELSTEYIPDPERVFYTGQVV 

KWVLNCEPSKERMLLSFKLSSDPEPKKEPAGHS 

QKKGKAINIGQLVDVKVLEKTKDGLEVAVLPHN 

IRAFLPTSHLSDHVANGPLLHHWLQAGDILHRVL 

CLSQSEGRVLLCRKPALVSTVEGGQDPKNFSEIH 

PGMLLIGFVKSnCDYGVFIQLPSGLSGLAPKAIMS 

DKFVTSTSDHFVEGQTVAAKVTNVDEEKQRMLL 

SLRLSDCGLGDLAITSLLLLNQCLEELQGVRSLM 

SNRDSVLIQTLAEMTPGMFLDLVVQEVLEDGSV 

VFSGGPVPDLVLKASRYHRAGQEVESGQKKKVV 

ILNYDLLKLEVHVSLHQ\DLV\NRKARKLRKGSE 

HQAIVQHLEKSFAIASLVETGHLAAFSLTSHLND 

TFRFDSEKLQVGQGVSLTLKTTEPGVTGLLLAVE 

GPAAKRTMRPTQKDSETVDEDEEVDPALTVGTI 

KKHTLSIGDMVTGTVKSIKPTHVVVTLEDGIIGCI 

HASHILDDVPEGTSPTTKLKVGKTVTARVIGGRD 

MKTFKYLPISHPRFVRTIPELSVRPSELEDGHTAL 

NTHSVSPMEKIKQYQAGQTVTCFLKKYNVVKK 

WLEVEIAPDIRGRIPLLLTSLSFKVLKHPDKKFRV 

GQALRATWGPDSSKTFLCLSLTGPHKLEEGEVA 

MGRVVKVTPNEGLTVSFPFGKIGTVSIFHMSDSY 

SETPLEDFVPQKVVRCYILSTADNVLTLSLRSSRT 

NPETKSKVEDPEINSIQDIKEGQLLRGYVGSIQPH 

GVFFRLGPSVVGLARYSHVSQHSPSKKALYNKH 

LPEGKLLTARVLRLNHQKNLVELSFLPGDTGKPD 

VLSASLEGQLTKQEERKTEAEERDQKGEKKNQK 

RNEKKNQKGQEEVEMPSKEKQQPQKPQAQKRG 

GRECRESGSEQERVSKKPKKAGLSEEDDSLVDV 

YYREGKEEAEETNVLPKEKQTKPAEAPRLQLSSG 

FAWNVGLDSLTPALPPLAESSDSEEDEKPHQATI 

KKSKKERELEKQKAEKELSRTEEALMDPGRQPE 

SADDFDRLVLSSPNSSILWLQYMAFHLQATEIEK 

ARAVAERALKTISFREEQEKLNVWVALLNLENM 

YGSQESLTKVFERAVQYNEPLKVFLHLADIYAKS 

EKFQEAGELYNRMLKRFRQEKAVWIKYGAFLLR 

RSQAAA SHRVLQRALECLPSKEHVDVIAKFAQL 

EFQLGDAERAKAIFENTLSTYPKRTDVWSVYID 

MTIKHGSQKDVRDIFERVIHLSLAPKRMKFFFKR 

YLDYEKQHGTEKDVQAVKAKALEYVEAKSSVL 

ED 


3470


A 


2334


1226 


TAAAPVAPGTMDDATVLRKKGYIVGINLGKGSY 

AKVKSAYSERLKFNVAVKnARKKTPTDFVERFL 

PREMDILATVNHGSIIKTYEIFETSDGRIYIIMELG 

VQGDLLEFIKCQGALHEDVARKMFRQLSSAVKY 

CHDLDIVHRDLKCENLLLDKDFNIKLSDFGFSKR 

CLRDSNGRIILSKTFCGSAAYAAPEVLQSIPYQPK 

VYDIWSLGVILYIMVCGSMPYDDSDIRKMLRIOK 

EHRVDFPRSKNLTCECKDLIYRMLQ\PDVS\KRLH 

IDEILSHSWLQPPKPK\ATSSASFKREGEGKYRAE 

CKLDTKTGLRPDHRPDHBCLGAKTQHRLLWPEN 

ENRMEDRLAETSRAKDHHISGAEVGKAST 


3471 


A 


537 


148 


TERGAPQHPTLPLPSLTPSSVHTGQPKTTPSVILFL 
PSCEEPQANKATLVCLMNN/FYPGILMVTWKAD 
GTLITQSVEKTTPSKQS>WKYVASSYLSLTPEQW 
RSRRSYSCQVMQEGSTVEKSVAPAECS 


3472 


A 


1 


2272 


DKPTRHKTYLSSSWAKMAAAEGPVGDGELWQT 

WLPNHVVFLRLREGLKNQSPTEAEKPASSSLPSS 

PPPQLLTRNWFGLGGELFLWDGEDSSFLWRLR 

GPSGGGEEPALSQYQRLLCINPPLFEIYQVLLSPT 

QHHVALIGIKGLMVLELPKRWGKNSEFEGGKST 

VNCSTTPVAERFFTSSTSLTLKHAAWYPSEILDPH

VVLLTSDNVIRIYSLREPQTPTNVIILSEAEEESLV

LNKGRAYTASLGETAVAFDFGPLAAVPKTLFGQ 

NGKDEVVAYPLYILYENGETFLTYISLLHSPGN/I 

WKAVGSIAHAS\AAEDNYGYDACAVLCLPCVPN 

ILVIATESGMLYHCWLEGEEEDDHTSEKSWDSR 

IDLIPSLYVFECVELELALKLASGEDDPFDSDFSC 

PVKLHRDPKCPSRYHCTHEAGVHSVGLTWIHKL 

HKFLGSDEEDKDSLQELSTEQKCFVEHDLCTKPLP 

CRQPAPIRGFWIVPDILGPTMICITSTYECLIWPLL 

STVHPASPPLLCTREDVEVAESPLRVLAETPDSFE 

KHIRSILQRSVANPAFLKASEKDIAPPPEECLQLLS 

RATQVFREQYILKQDLAKEEIQRRVKLLCDQKK 

KQLEDLSYCREERKSLREMAERLADKYEEAKEK 

QEDIMNRMKKLLHSFHSELPVLSDSERDMKKEL 

QLIPDQLRHLGNAIKQVTMKKDYQQQKMEKVL 

SLPKPTIILSAYQRKCIQSILKEEGEHIREMVKQIN 

DIRNHVNF 


3473 


A 


1 


2272 


DKPTRHKTYLSSSWAKMAAAEGPVGDGELWQT 

WLPNHVVFLRLREGLKNQSPTEAEKPASSSLPSS 

PPPQLLTRNVVFGLGGELFLWDGEDSSFLWRLR 

GPSGGGEEPALSQYQRLLCINPPLFEIYQVLLSPT 

QHHVALIGDCGLMVLELPKRWGKNSEFEGGKST 

VNCSTTPVAERFFTSSTSLTLKHAAWYPSEILDPH 

WLLTSDNVIRIYSLREPQTPTNVIILSEAEEESLV 

LNKGRAYTASLGETAVAFDFGPLAAVPKTLFGQ 

NGKDEVVAYPLYILYENGETFLTYISLLHSPGN/I 

WKAVGSIAHAS\AAEDNYGYDACAVLCLPCVPN 

ILVIATESGMLYHCWLEGEEEDDHTSEKSWDSR 

IDLIPSLYVFECVELELALKLASGEDDPFDSDFSC 

PVKLHRDPKCPSRYHCTHEAGVHSVGLTWIHKL 

HKFLGSDEEDKDSLQELSTEQKCFVEHILCTKPLP 

CRQPAPIRGFWIVPDILGPTMICITSTYECLIWPLL 

STVHPASPPLLCTREDVEVAESPLRVLAETPDSFE 

KHIRSILQRSVANPAFLKASEKDIAPPPEECLQLLS 

RATQVFREQYILKQDLAKEEIQRRVKLLCDQKK 

KQLEDLSYCREERKSLREMAERLADKYEEAKEK

QEDIMNRMKKLLHSFHSELPVLSDSERDMKKEL

QLIPDQLRHLGNAIKQVTMKKDYQQQKMEKVL 

SLPKPTIILSAYQRKCIQSILKEEGEHIREMVKQIN 

DIRNHVNF 


3474 


A 


4344 


2550 


DRRREPERHVRVKQRTSVLNMLRRLDKIRFRGH 

KRDDFLDLAESPNASDTECSDEIPLKVPRTSPRDS 

EELRDPAGPGTLIMATGVQDFNRTEFDRLNEIKG 

HLEIALLEKHFLQEELRKLREETNAEMLRQELDR 

ERQRRMELEQKVQEVLKARTEEQMAQQPPKGQ 

AQASNGAERRSQGLSSRLQKWFYERFGEYVEDF 

RPQPEENTVETEEPLSARRLTENMRRLKRGAKPV 

TNFVKNLSALSDWSVYTSAL^FIVYMNAVWH 

GWAIPLFLFLAILRLSLNYLIARGWRIQWSIVPEV 

SEPVEPPKEDLTVSEKFQLVLDVAQKAQNLFGK 

MADILEKIKNLFMWVQPEITQKLYVALWAAFLA 

SCFFPYRLVGLAVGLYAGIKFFLIDFIFKRCPRLR 

AKYDTPYIIWRSLPTDPQLKERSSAAVSRRLQTTS 

SRSYVPSAPAGLGKEEDAGRFHSTKKGNFHEIFN 

LTENERPLAVCENGWRCCLINRDRKMPTDYIRN 

GVLYVT\ENYLCFESSKSGSSKRNKVIKLVDITDI 

QKYKVLSVLPGSGMGIAVSTPSTQKPLVFGAMV 

HRDEAFETILSQYIKITSAAASGGDS 


3475 


A 


2 


1126 


TAARRRQKGAAAAAETHGQAKAKSGWLKPYYF 

lELMBSRKDITNQEELWKMKPRRNLEEDDYLHK 

DTGETSMLKRPVLLHLHQTAHADEFDCPSELQH 

TQELFPQWHLPIKIAAIIASLTFLYTLLREVmPLA 

TSHQQYFYKIPILVINKVLPMVSITLLALVYLPGV 

IAAIVQLHNGTKYKKFPHWLDKWMLTRKQFGL

LSFFFAVLHAIYSLSYPMRRSYRYKLLNWAYQQ 

VQQNKEDAL\IEHDVWRMEIYVSLGIVGLAILAL 

LAVTSIPSVSDSLTWREFHYIQSKLGIVSLLLGTIH 

ALIFAWNKWIDIKQFVWYTPPTFMIAVFLPrWLI 

FKSILFLPCLRKKILKIRHGWEDVTKINKTEICSQL 


3476 


A 


143 


3191 


AKAPPTGESSEPEAKVLHTKRLYRAVVEAVHRL 

DLILCNKTAYQEVFKPENISLRNKLRELCVKLMF 

LHPVDYGRKAEELLWRKVYYEVIQLIKTNKKHI 

HSRSTLECAYRTHLVAGIGFYQHLLLYIQSHYQL

ELQCCIDWTHVTDPLIGCKKPVSASGKEMDWAQ 

MACHRCLVYLGDLSRYQNELAGVDTELLAERFY 

YQALSVAPQIGMPFNQLGTLAGSKYYNVEAMY 

CYLRCIQSEVSFEGAYGNLKRLYDKAAKMYHQL 



KKCETRKLSPGKKRCKDIKiaLVimiYLQSLLQ 

PKSSSVDSELTSLCQSVLEDFM.CLFYLPSSPNLS 

LASEDEEEYESGYAFLPDLLIFQMVIICLMCVHSL 

ERAGSKQYSAAIAFTLALFSHLXWIVNIRLQAEL 

EEGENPVPAFQSDGTDEPESKEPVEKEEEPDPEPP 

PVTPQVGEGRKSRKFSRLSCLRRRRHPPKVGDDS 

DLSEGFESDSSHDSARASEGSDSGSDKSLEGGGT 

AFDAETDSEMNSQESRSDLEDMEEEEGTRSPTLE 

PPRGRSEAPDSLNGPLGPSEASIASNLQAMSTQM 

FQTKRCFRIAPTFSNLLLQPTTNPHTSASHRPCV 

NGDVDKPSEPASEEGSESEGSESSGRSCRNERSIQ 

EKLQVLMAEGLLPAVKVFLDWLRTNPDLErVCA 

QSSQSLWNRLSVLLNLLPAAGELQESGLALCPEV 

QDLLEGCELPDLPSSLLLPEDMALRNLPPLRAAH 

RRFNFDTDRPLLSTLEESVVRICCIRSFGPfflARLQ 

GSILQFNPEVGIFVSIAQSEQESLLQQAQAQFRMA 

QEEARRNRLMRDMAQLRLQLEVSQLEGSLQQPK 

AQSAMSPYLVPDTQALCHHLPVIRQLATSGRFIVI 

IPRTVIDGLDLLKKEHPGARDGIRYLEAEFKKGN 

RYIRCQKEVGKSFERHKLKRQDADAWTLYKILD 

SCKQLTVLAQGAGEEDPSGMVTIITGLPLDNPSVL 

SGPMQAALQAAAHASVDIK>IVLDFYKQWKEIG 


3477 


A 


1 


3902 


MTEPRERRGYSVPPRPEVGTQATEWRVEESNFN 

KIFLKKDAELGRSNHLPTWDKPEDASWLPQSCL 

GGDAVATTGEfflEEKAWKTRALEVGQPAQRDIR 

RGELWGKEHGADQAIQETLEDLSSLERTLVVSES 

SPLGGDCQEVTTLTVKYQVSEEVPSGTVIGKLSQ 

ELGREERRRQAGAAFQVLQLPQALPIQVDSEEGL 

LSTGRRLDREQLCRQWDPCLVSFDVLATGDLALI 

HVEIQVLDINDHQPRFPKGEQELEISESASLRTRIP 

LDRALDPDTGPNTLHTYTLSPSEHFALDVIVGPD 

ETKHAELIVVKELDREfflSFFDLVLTAYDNGNPP 

KSGTSLVKVNVLDSNDNSPAFAESSLALEIQEDA 

APGTLLIKLTATDPDQGPNGEVEFFLSKHMPPEW 

LDTFSIDAKTGQVILRRPLDYEKNPAYEVDVQAR 

DLGPNPIPAHCKVLIKVLDVNDNIPSIHVTWASQP 

SLVSEALPKDSFIALVMADDLDSGNNGLVHCWL 

SQELGHFRLKRTNGNTYMLLTNATLDREQWPK 

YTLTLLAQDQGLQPLSAKKQLSIQISDINDNAPVF 

EKSRYEVSTRENNLPSLHLITIKAHDADLGINGK 

VSYRIQDSPVAHLVAIDSNTGEVTAQRSLNYEEM 

AGFEFQVIAEDSGQPMLASSVSVWVSLLDANDN 

APEVVQPVLSDGKASLSVLVNASTGHLLVPIETP 

NGLGPAGTDTPPLATHSSRPFLLTTIVARDADSG 

ANGEPLYSIRSGNEAHLFILNPHTGQLFVNVTNA 

SSLIGSEWELEIVVEDQGSPPLQTRALLRVMFVTS 

VDHLRDSARKPGALSMSMLTVICLAVLLGIFGLI 

LALFMSICRTEKKDNRAYNCREAESTYRQQPKR 

PQKfflQKADIHLVPVLRGQAGEPCEVGQSHKDV 

DKEAMMEAGWDPCLQAPFHLTPTLYRTLRNQG 

NQGAPAESREVLQDTVNLLFNHPRQRNASRENL 

NLPEPQPATGQPRSRPLKVAGSPTGRLAGDQGSE 

EAPQRPPASSATLRRQRHLNGKVSPEKESGPRQI 

LRSLVRLSVAAFAERNPVEELTVDSPPVQQISQLL 

SLLHQGQFQPKPNHRGNKYLAKPGGSRSAIPDTD 

GPSARAGGQTDPEQEEGPLDPEEDLSVKQLLEEE 

LSSLLDPSTGLALDRLSAPDPAWMARLSLPLTTN 

YRDNVISPDAAATEEPRTFQTFGKAEAPELSPTG 

TRLASTFVSEMSSLLEMLLEQRSSMPVEAASEAL 

RRLSVCGRTLSLDLATSAASGMKVQGDPGGKTG 

TEGKSRGSSSSSRCL 


3478 


A 


13 


1620 


TLPPPGNSGCHRLCFPEFEFLQVTKMEFSGRKWR 

KLRLAGDQRNASYPHCLQFYLQPPSENISLIEFEN 

LAIDRVKLLKSVENLGVSYVKGTEQYQSKLESEL 

RKLKFSYRENLEDEYEPRRRDHISHFILRLAYCQS 

EELRRWFIQQEMDLLRFRFSILPKDKIQDFLKDSQ 

LQFEAISDEEKTLREQEIVASSPSLSGLKLGFESIY 

KIPFADALDLFRGRKVYLEDGFAYVPLKDIVAIIL 

NEFRAKLSKALALTARSLPAVQSDERLQPLLNHL 

SHSYTGQDYSTQGNVGKISLDQIDLLSTKSFPPC 

MRQLHKALRENHHLRHGGRMQYGLFLKGIGLT 

LEQALQFWKQEFIKGKMDPDKFDKGYSYNIRHS 

FGKEGKRTDYTPFSCLKIILSNPPSQGDYHGCPFR 

HSDPELLKQKLQSYKISPGGISQILDLVKGTHYQ 

V\ACQKYFEMIHTVDDCGFS\LSHPNQYFCESQRI 

LNGGKDIKKEPIQPETPQPKPSVQKTKDASSALA 

SLNSSLEMDMEGLEDYFSEDS 


3479 


A 


698 


138 


RPELELWRLRSRSWRPLGVPRRCHRRNWKEPVR 

AQPLSVTVWAPRCQRP/QPPAPEPSSPNAAVPEAI 

PTPRAAASAALELPLGPAPVSVAPQAEAEARSTP 

GPAGSRLGPETFRQRFRQFRYQDAAGPREAFRQL 

REL/SPRQWLRPDI\RTKEQ\IVEMLVQEQLLAILP 

EAARARRJRRRTDVRITG 


3480 


A 


117 


2226 


RRGSRSRGPFAEPAAPGGLCSSSEEKTEEGGMAV 

GLCKAMSQGLVTFRDVALDFSQEEWEWLKPSQ 

KDLYRDVMLENYRNLVWLGLSISKPNMISLLEQ 

GKEPWMVERKMSQGHCADWESWWEIEELSPK 

WFIDEDEISQEMVMERLASHGLECSSFREAWKY 

KGEFELHQGNAERHFMQVTAVKEISTGKRDNEF 

SN/IWEKHTPEISIFNTTES\PTIQQVHKFDIYDKLF 

PQNSVIIEYKRLHAEKESLIGNECEEFNQSTYLSK 

DIGIPPGEKPYESHDFSKLLSFHSLFTQHQTTHFG 

KLPHGYDECGDAFSCYSFFTQPQRIHSGEKPYAC 

NDCGKAFSHDFFLSEHQRTHIGEKPYECKECNKA 

FRQSAHLAQHQRIHTGEKPFACNECGKAFSRYAF 

LVEHQRIHTGEKPYECKECNKAFRQSAHLNQHQ 

RIHTGEKPYECNQCGKAFSRRIALTLHQRIHTGE 

KPFKCSECGKTFGYRSHLNQHQRIHTGEKPYECI 

KCGKFFRTDSQLNRHHRIHTGERPFECSKCGKAF 

SDALVLIHHKRSHAGEKPYECNKCGKAFSCGSY 

LNQHQRIHTGEKPYECSECGKAFHQILSLRLHQRI 

HAGEKPYKCNESQRVRRSELAVSRGLTTKPADT 

GPDSTLNAAKVAEPARAGTEAALRPALSVAESA 

TSLGPLHQGRRFPEAPAAHPGGTGFTVCAS 


3481 


A 


2 


1522 


ASRHGMTPGALLMLLGALGPPLAPGVRGSEAEG 
RLREKLFSGYDSSVRPAREVGDRVRVSVGLILAQ 
LISLNEKDEEMSTKVYLDLEWTDYRLSWDPAEH 
DGIDSLRITAESVWLPDWLLNNNDGNFDVALDI 
SVWSSDGSVRWQPPGIYRSSCSIQVTYFPFDWQ 
NCTMVFSSYSYDSSEVSLQTGLGPDGQGHQEIHI 

HEGTFIENGQWENIHKPSRLIQPPGDPRGGREGQ 

RQEVIFYLIIRRKPLFYLVNVIAPCILITLLAIFVFY 

LPPDAGEKMGLSIFALLTLTVFLLLLADKVPETSL 

SVPIIIKYLMFTMVLVTFSVILSVVVLNLHHRSPH 

THQMPLWVRQIFIHKLPLYLRLKRPKPERDLMPE 

PPHCSSPGSGWGRGTDEYFIRKPPSDFLFPKPNRF 

QPELSAPDLRRFIDGPNRAVALLPELREVVSSISYI 

ARQLQEQEDHDALKEDWQFVAMVVDRLFLWTT 

IIFTSVGTLWIFLDATYHLPPPDPFP 


3482 


A 


1273 


172 


ERWDSGGADAEWYALADWTAVWLPRSDFYTR 

LQTGEGHVPALRLPAGMPPDSPRELVPKQAPCSP 

SDPALPWTLGHGNQPPAWPEPQGPMGPAGVAA 

RPGRFFGVYLLYCLNPRYRVRWYVGFTVNTARR 

VQQHNGGRKKGGAXGRTSGRGPWEMVLWHGF 

PSSVAALRFEWAWQHPHASRRLAPIVGPRLRGET 

AFAFHLRVLAHMLRAPPWARLPLTLRWVRPDLR 

QDLCLPPPPHVLLAFGPPPAQVPRPQRRRAGPFD 

DAEPEPDQGDPGACCSLCAQTIQDEEGPLCCPHP 

GCLLRAHVICLAEEFLQEEPGQLLPLEGQCPCCE 

KSLLWGDLIWLCQMDTEKEVEDSELEEAHWTD 

LLET 


3483 


A 


230 


3686 


WRPWPCIDTSWNLQVAARTLRVSSAQCGLVPT 

MARVESPVPAARASLTGSCVLGQAMPLRGGAGP 

SPASHGPTHGPSDPRTCLPGRGAGGMRPHGRGA 

LGCCGLCSFYTCHGAAGDEIMHQDIVPLCAADIQ 

DQLKKRFAYLSGGRGQDGSPVITFPDYPAFSEIPD 

KEFQNVMTYLTSIPSLQDAGIGFILVIDRRRDKW 

TSVKASVLRIAASFPANLQLVLVLRPTGFFQRTLS 

DIAFKFNRDDFKMKVPVIMLSSVPDLHGYIDKSQ 

LTEDLGGTLDYCHSRWLCQRTAIESFALMVKQT 

AQMLQSFGTELAETELPNDVQS'nSSVLCAHTEK 

KDKAKEDLRLALKEGHSVLESLRELQAEGSEPSV 

NQDQLDNQATVQRLLAQLNETEAAFDEFWAKH 

QQKLEQCLQLRHFEQGFREVKAILDAASQKIATF 

TDIGNSLAHVEHLLRDLANFQEKSGVF VERARA 

LSLTASSFIGNKHYAVDSIRPKCQELRHLCDQFSA 

EIARRRGLLSKSLELHRRLETSMKWCDEGFVTLA 

SQPVDKCQSQDGAEAALQEIEKFLETGAENKIQE 

LNAIYKEYESILNQDLMEHVRKVFQKQASMEEV 

FHRRQASLKKLAARQTRPVQPVAPRPEALAKSP 

CPSPGIRRGSENSSSEGGALRRGPYRRAKSEMSES 

RQGRGSAGEEEESLAILRRHVMSELLDTERAYVE 

ELLCVLEGYAAEMD]SnPLlVL\HLLSTGLHNKK^ 

LFGNMEEIYHPHNRIFLRELENYTDCPELVGRCF 

LERMEDFQIYEKYCQNKPRSESLWRQCSDCPFFQ 

ECQRKLDHKLSLDSYLLKPVQRITKYQLLLKEM 

LKYSRNCEGAEDLQEALSSILGILKAVNDSMHLI 

AITGYDGNLGDLGKLLMQGSFSVWTDHKRGHT 

KVKELARFKPMQRHLFLHEKAVLFCKKREENGE 

GYEKAPSYSYKQSLNMAAVGITENVKGDAKKFE 

IWYNAREEVYIVQAPTPEIKAAWVNEIRKVLTSQ 

LQACREASQHRALEQSQSLPLPAPTSTSPSRGNSR 

NIKKLEERKTDPLSLEGYVSSAPLTKPPEKGKGW 

SKTSHSLEAPEDDGGWSSAEEQINSSDAEEDGGL 

GPKKLVPGKYTVVADHEKGGPDALRVRSGDVV 



ELVQEGDEGLW 


3484 


A 


208 


6103 


VTMAQQAADKYLYVDKNFINNPLAQADWAAK 

KLVWVPSDKSGFEPASLKEEVGEEATVELVENGK 

KVKVNKDDIQKMNPPKFSKVEDMAELTCLNEAS 

VLHNLKERYYSGLIYTYSGLFCWINPYKNLPIYS 

EEIVEMYKGKKRHEMPPHIYAITDTAYRSMMQD 

REDQSILCTGESGAGKTENTKKVIQYLAYVASSH 

KSKKDQGELERQLLQANPELEAFGNAKTVKNDN 

SSRFGKFIRINFDVNGYIVGANIETYLLEKSRAIRQ 

AKEERTFHIFYYLLSGAGEHLKTDLLLEPYNKYR 

FLSNGHVTIPGQQDKDMFQETMEAMRIMGIPEEE 

QMGLLRVISGVLQLGNTVFKKERNTDQASMPDN 

TAAQKVSHLLGINVTDFTRGILTPRIKVGRDYVQ 

KAQTKEQADFAIEALAKATYERMFRWLVLRINK 

ALDKTKRQGASFIGILDIAGFEEFDLNSFEQLCINY 

TNEKLQQLFNHTMFILEQEEYQREGIEWNFIDFG 

LDLQPCIDLIEKPAGPPGILALLDEECWFPKATDK 

SFVEKVMQEQGTHPKFQKPKQLKDKADFCIIHY 

AGKVDYKADEWLMKNMDPLNDNIATLLHQSSD 

KFVSELWKDVDRIIGLDQVAGMSETALPGAFKT 

RKGMFRTVGQLYKEQLAKLMATLKNTNPNFVR 

CIIPNHEKKAGKLDPHLVLDQLRCNGVLEGIRICR 

QGFPNRVVFQEFRQRYEILTPNSIPKGFMDGKQA 

CVLMIKALELDSNLYRIGQSKVFFRAGVLAHLEE 

ERDLKITDVIIGFQACCRGYLARKAFAKRQQQLT 

AMKVLQRNCAAYLKLRNWQWWRLFTKVKPLL 

QVSRQEEEMMAKEEELVKVREKQLAAENRLTE 

METLQSQLMAEKLQLQEQLQAETELCAEAEELR 

ARLTAK\KQ\ELEEICHDLEARVEEEEERCQHLQA 

EKKKMQQNIQELEEQLEEEESARQKLQLEKVTT 

EAKLKKLEEEQULEDQNCKLAKEKKLLEDRIAEF 

TTNLTEEEEKSKSLAKLKNKHEAMITDLEERLRR 

EEKQRQELEKTRRKLEGDSTDLSDQIAELQAQVIA 

ELKMQLAKKEEELQAALARVEEEAAQKNMALK 

KIRELESQISELQEDLKCERNASRNKAEKQKRDLG 

EELEALKTELEDTLDSTAAQQELRSBOyEQEVNIL 

KKTLEEEAKTHEAQIQEMRQKHSQAVEELAEQL 

EQTKRVKANLEKAKQTLENERGELANEVKVLLQ 

GKGDSEHKRKKVEAQLQELQVKFNEGERVRTEL 

ADKVTKLQVELDNVTGLLSQSDSKSSKLTKDFS 

ALESQLQDTQELLQEENRQKLSLSTKLKQVEDE 

KNSXFREQLEEEEEEAKHNLEKQIATLHAQVADM 

KKKMEDSVGCLETAEEVKRKLQKDLEGLSQRHE 

EKVAAYDKLEKTKTRLQQELDDLLVDLDHQRQ 

SACNLEKKQKKFDQLLAEEKTISAKYABERDRA 

EAEAREKETKALSLARALEEAMEQKAELERLNK 

QFRTEMEDLMSSKDDVGKSVHELEKSKRADEQQ 

VEEMKTQLEELEDELQATEDAKLRLEVNLQAM 

KAQFERDLQGRDEQSEEKKKQLVRQVREMEAE 

LEDERKQRSMAVAARKKLEMDLKDLEAHIDSA 

NKNRDEAIKQLRKLQAQMKDCMRELDDTRASR

EEILAQAKENEKKLKSMEAEMIQLQEELAAAER 

AKRQAQQERDELADEIANSSGKGALALEEKRRL 

EARIAQLEEELEEEQGNTELINDRLKKANLQIDQI

NTDLNLERSHAQKNENARQQLERQNKELKVKL

QEMEGTVKSKYKASITALEAKIAQLEEQLDNETK 

ERQAACKQVRRTEKKLKDVLLQVDDERRNAEQ 

YKDQADKASTRLKQLKRQLEEAEEEAQRANASR 

RKLQRELEDATETADAMNREVSSLKNKLRRGDL 

PFWPRRMARKGAGDGSDEEVDGKADGAEAKP 

AE 


3485 


A 


2 


1782 


CSTGVSKAPLTYLMSYGFELGWRKGNRAVACR 

EDRGGESVGMGQESILSQVHWWEAEPVEKTPGR 

DSEATIMSLRVHTLPTLLGAVVRPGCRELLCLLM 

ITVTVGPGASGVCPTACICATDIVSCTNKNLSKVP 

GNLFRLIKRLDLSYNRIGLLDSEWIPVSFAKLNTL 

ILRHNNITSISTGSFSTTPNLKCLDLSSNKLK'AVK 

NAVFQELKVLEVLLLYNNHISYLDPSAFGGLSQL 

QKLYLSGNFLTQFPMDLYVGRFKLAELMFLDVS 

YNRIPSMPMHHINLVPGKQLRGIYLHGNPFVCDX 

CSLVSLLVFWYRRHFSSVMDFKNDYTCRLWSDS 

RHSRQVLLLQDSFMNCSDSIINGSFRALGFIHEAQ 

VGERLMVHCDSKTGNANTDFIWVGPDNRLLEPD 

KEMENFYVFHNGSLVIESPRFEDAGVYSCIAMNK 

ORLLNETVDVTINVSNFTVSRSHAHEAFNTAFTT 

LAACVASIVLVLLYLYLTPCPCKCKTKRQKNML 

HQSNAHSSILSPGPASDASADERKAGAGKRVVFL 

EPLKDTAAGQNGKVRLFPSEAVIAEGILKSTRGK 

SDSDSVNSVFSDTPFVAST 


3486 


A 


357 


1173 


GDPRETKVFPSRSFARNTVGVSHHQSHLFHTVSR 

IYVEDKHKILYCEVPKAGCSNWKRILMVLNGLA

SSAYNISHNAVHYGKHLKKLDSFDLKGIYTRLDT 

YTKMLVLVRDPMERLVSAFRDKFDHPNSYYHPVF 

GKAIIKKYRPNACEEALINGSGVKFKEFIHYLLDS 

HRPVGMDIHWEKVSKLCYPCLINYDFVGKFETL 

EEDANYFLQMIGAPKELKFPNFKDRHSSDERTNA 

QVVRQYLKDLTRTERQLIYDFV^LDYLMFNYTT 

PFL 


3487 


A 


2 


3281 


CDKSGAVPFSTTRSPRRPSPRSAGPSLSSVSPRSQ 

LWASSGLSEEHAAPLLPAWPRHPCPPSLTPGPSM 

AQGAMRFCSEGDCAISPPRCPRRWLPEGPVPQSP 

PASMYGSTGSLLRRVAGPGPRGRELGRVTAPCTP 

LRGPPSPRVAPSPWAPSSPTGQPPPGAQSSWIFR 

FVEKASVRPLNGLPAPGGLSRSWDLGGVSPPRPT 

PALGPGSNRKLRLEASTSDPLPARGGSALPGSRN 

LVHGPPAPPQVGADGLYSSLPNGLGDPPERLATL 

FGGPADTGFLNQGDTWSSPREVSSHAQRIARAK 

WEFFYGSLDPPSSGAKPPEQAPPSPPGVGSRQGS 

GVAVGRAAKYSETDLDTVPLRCYRETDIDEVLA 

EREEADSAIESQPSSEGPPGTAYPPAPRPGPLPGP 

HPSLGSGNEDEDDDEAGGEEDVDDEVFEASEGA 

RPGSRMPLKSPVPFLPGTSPSADGPDSFSCVFEAI 

LESHRAKGTSYTSLASLEALASPGPTQSPFFTFEL 

PPQPPAPRPDPPAPAPLAPLEPDSGTSSAADGPWT 

QRGEEEEAEARAKXAPGREPPSPCHSEDSLGLGA 

APLGSEPPLSQLVSDSDSELDSTERLALGSTDTLS 

NGQKADLEAAQRLAKRLYRLDGFRKADVARHL 

GKNNDFSKLVAGEYLKFFVFTGMTLDQALRVFL 

KELALMGETQERERVLAHFSQRYFQCNPEALSSE 

DGAHTLTCALMLLNTDLHGHNIGKRMTCGDFIG 

NLEGLNDGGDFPRELLKALYSSIKNEKLQWAIDE 

EELRRFLSELADPNPKVIKRISGGSGSGSSPFLDLT 

PEPGAAVYKHGALVRKVHADPDCRKTPRGKRG 

WKSFHGILKGMILYLQKEEYKPGKALSETELKN 

AISIHHALATOAS\NYSKRPHVFYLRTADWRVFL 

FQAPSLEQMQSWITRINWAAMFSAPPFPAAVSS 

OKKFSRPLLPSAATRLSOEEOVRTHEAKLKAMA 

SELREHRAAQLGKKGRGKEAEEQRQKEAYLEFE 

KSRYSTYAALLRVKLKAGSEELDAVEAALAQAG 

STEDGLPPSHSSPSLQPKPSSQPRAQRHSSEPRPG 

AGSGRRKP 


3488 


A 


441 


1968


GTETPHCWGRGTAGLRRELDREERDGPGTATMS 

FPHFGHPYRGAFQFLVASASSSTTCCESTLRSVSY 

VASGSTPAPALCCAP\YDSRLLGSARPELG AALGI 

YGAPYAAAAAAQSYFGYLPYSPEPPSLYGALNP 

QYEFKEAAGSFTSSLAQPGAYYPYERTLGQYQY 

ERYGAVELSGAGRRKNATRETTSTLKAWLNEHR 

KNPYPTKGEKIMLAIITmXLTQVSTWFANARRR 

LKKENKMTWAPKNKGGEERKAEGGEEDSLGCL 

TADTKEVTASQEARGLRLSDLEDLEEEEEEEEiEA 

EDEEVVATAGDRLTEFRKGAQSLPGPCAAAREG 

RLERRECGLAAPRPSFNDPSGSEEADFLSAETGSP 

T?T TMTTYPPT FKPRTW^sl AHTATA9A VFGAPPARP 

RPRSPECRMIPGQPPASARRLSVPRDSACDESSCI 

PKAFGNPKFALQGLPLNCAPCPRRSEPVVQCQYP 

SGAEGSGPPAALGVSMQKTPTYRPARQLHTLCH 

SSLP 


3489 


A 


718 


2073 


IAAYHKALSYRGHVHANNRGTNNVHFTPPPSPS

RGILPMNPRNMMNHSQVGQGIGIPSRTNSMSSSG 

LGSPNRSSPSIICMPKQQPSRQPFTVNSMSGFGMN 

RNQAFGMNNSLSSNIFNGTDGSE>rVTGLDLSDFP 

AIJ^DRNRREGSGNPTPLINPLAGRAPYVGMVTK 

PANEQSQDFSIHNEDFPALPGSSYKDPTSSNDDSK 

SNLNTSGKTTSSTDGPKFPGDKSSTTQNNNQQKK 

GIQVLPDGRVTNIPQGMVTDQFGMIGLLTFIKAA 

ETDPGMVHLALGSDLTTLGLNLNSPENLYPKFAS 

PWASSPCl^PODIDFHVPSEYLTNIHTRDKLFFFFS 

A tv T\ij%jx wxvr v^j^Ai^i. XL y i uXmi X x^ X 1^ xx xxx\x^x\ t.jx x x x ^ 

W/TAIKLGRYGEDLLFYLYYMNGGDVLQLLAAV 
ELFNRDWRYHKEERVWITRAPGMEPTMKTNTY 
ERGTYYFFDCLNWRKVAKEFHLEYDKLEERPHL 
PSTFNYNPAQQAF 


3490 


A 


2 


2833 


FVAKMATSQYFDFAQGGGPQYSTQAPTLPLPTV 

GASYTGQPTPGMDPAVNPAFPPAAPAGYGGYQP 

HSGQDFAYGSRPQEPVPTATTMATYQDSYSYGQ 

SAAARSYEDRPYFQSAALQSGRMTAADSGQPGT 

QEACGQPSPHGSHSHAQPPQQAPIVESGQPASTL 

SSGYTYPTATGVQPESSASIVTSYPPPSYNPTCTA 

YTAPSYPNYDASVYSAASPFYPPAQPPPPPGPPQ 

QLPPPPAPAGSGSSPRADSKPPLPSKLPRPKAGPR 

QLQLHYCDICKISCAGPQTYREHLGGQKHRKKE 

AAQKTGVQPNGSPRGVQAQLHCDLCAVSCTGA 

DAYAAHIRGSKHQKVFKLHAKLGKPIPTLEPALA 

TESPPGAEAKPTSPTGPSVCASSRPALAKRPVASK 

ALCEGPPEPQAAGCRPQWGKPAQPKLEGPGAPT 

QGGSKEAPAGCSDAQPVGPEYVEEVFSDEGRVL 

RFHCKLCECSFNDLNAKDLHXOIGRRHRLQYRKK 

WPDLPIATCPSSRARKVLEERMRKQRHLAEERL 

EQLRRWHAERRELEEEPPQDVPPHAPPDWAQPL 

LMGRPESPASAPLQPGRRPASSDDRHVMCKHATl 

YPTEQELLAVQRAVSHAERALKLVSDTLAEEDR 

GRREEEGDKRSSVAPQTRVLKGVMRVGILAKGL 

LLRGDRNVRLALLCSEKPTHSLLRRIAQQLPRQL 

QMVTEDEYEVSSDPEANIVISSCEEPRMQVTISVT 

SPLMREDPSTDPGVEEPQADAGDVLSPKKCLESL 

AALRHARWFQARASGLQPCVIVIRVLRDLCRRV 

FRWGALPAWAMELLVEKAVSSAAGPLGPGDAV 

RRVLECVATGTLLTDGPGLQDPCEBIDQTDALEP 

MTLQEREDVTASAQHALRMLAFRQTHKVLGMD 

LLPPRHRLGARFRKRQRGPGEGEEGAGEKKRGR 

RGGEGLV 


3491 


A 


2 


1321 


FVGDGALSGCRRGRAPRVPSMAGSLPPCVVDCG 

TGYTKLGYAGNTEPQFIIPSCIAIRESAKVVDQAQ 

RRVLRGVDDLDFFIGDEAIDKPTYATKWPIRHGII

EDWDLMERFMEQVVFKYLRAEPEDHYFLMTEP 

PLNTPENREYLAEIMFESFNVPGLYIAVQAVLAL 

AASWTSRQVGERTLTGIVIDSGDGVTHVIPVAEG 

YVIGSCIKHIPIAGRDITYFIQQLLREREVGIPPEQS 

LETAKAIKEKYCYICPDIVKEFAKYDVDPRKWIK 

QYTGINAINQKKFVIDVGYERFLGPEIFFHPEFAN 

PDFMESISDVVDEVIQNCPIDVRRPLYKNVVLSG 

GSTMFRDFGRRLQRDLKRVVDARLRLSEELSGG\ 

RIKPKPVEVQWTHHMQRYAV\WFGG\SMLASTP 

EFFQVCHTKKDYEEYGPSICRHNPVFGVMS 


3492 


A 


3 


2024 


PNGVALLHLPGAAVIPNTNYMFQDALGGRSRGS 

REESPAPSRAPASASLWRRLVVVEAKMAAHAAA 

AAQAAAAQAAHAEAADSWYLALLGFAEHFRTS 

SPPKIRLCVHCLQAVFPFKPPQRIEARTHLQLGSV 

LYHHTKNSEQARSHLEKAWLISQQIPQFEDVKFE 

AASLLSELYCQENSVDAAKPLLRKAIQISQQTPY 

WHCRLLFQLAQLHTLEKDLVSACDLLGVGAEY 

ARWGSEYTRALFLLSKGMLLLMERKLQEVHPL 

LTLCGQIVENWQGNPIQKESLRVFFLVLQVTHYL 

DAGQVKSVKPCLKQLQQCIQTISTLHDDEILPSNP 

ADLFHWLPKEHMCVLVYLVTVMHSMQAGYLE

KAQKYTDKALMQLEKLKMLDCSPILSSFQVILLE 

HIIMCRLVTGHKATALQEISQVCQLCQQSPRLFS 

NHAAQLHTLLGLYCVSVNCMDNAEAQFTTALR 

LTNHQELWAFIVTNLASVYIREGNRHQEVV\LYS 

LLERINPDHSFPVSSHCLRAAAFYVRGLFSFFQGR 

YNEAKRFLRETLKMSNAEDLNRLTACSLVLLGHI 

FYVLGNHRESNNMVVPAMQLASKIPDMSVQLW 

SSALLRDLNKACGNAMDAHEAAQMHQNFSQQL 

LQDHIEACSLPEHNLITWTDGPPPVQFQAQNGPN 

TSLASLL 


3493 


A 


3 


2024 


PNGVALLHLPGAAVIPNTNYMFQDALGGRSRGS 

REESPAPSRAPASASLWRRLVVVEAKMAAHAAA 

AAQAAAAQAAHAEAADSWYLALLGFAEHFRTS 

SPPKIRLCVHCLQAVFPFKPPQRIEARTHLQLGSV 

LYHHTKNSEQARSHLEKAAVLISQQIPQFEDVKFE 

AASLLSELYCQENSVDAAKPLLRKAIQISQQTPY 

WHCRLLFQLAQLHTLEKDLVSACDLLGVGAEY 

ARVVGSEYTRALFLLSKGMLLLMERKLQEVHPL 

LTLCGQIVENWQGNPIQKESLRVFFLVLQVTHYL 

DAGQVKSVKPCLKQLQQCIQTISTLHDDEILPSNP 

ADLFHWLPKEHMCVLVYLVTVMHSMQAGYLE

KAQKYTDKALMQLEKLKMLDCSPILSSFQVILLE 

HIIMCRLVTGHKATALQEISQVCQLCQQSPRLFS 

NHAAQLHTLLGLYCVSVNCMDNAEAQFTTALR 

LTNHQELWAFIVTNLASVYIREGNRHQEVV\LYS 

LLERINPDHSFPVSSHCLRAAAFYVRGLFSFFQGR 

YNEAKRFLRETLKMSNAEDLNRLTACSLVLLGHI 

FYVLGNHRESNNMVVPAMQLASKIPDMSVQLW 

SSALLRDLNKACGNAMDAHEAAQMHQNFSQQL 

LQDHIEACSLPEHNLITWTDGPPPVQFQAQNGPN 

TSLASLL 


3494 


A 


2 


1615 


VLRGQRGPAGGLAEERRRGRNEWRIHDVTTAPF 

PGLVQRRSRLLIVSQVRYFLIQsIKVSPDLCNEDGL 

TALHQCCIDNFEEIVKLLLSHGANVNAKDNELW 

TPLHAAATCGHINLVKILVQYGADLLAVNSDGN 

MPYDLCEDEPTLDVmTCMAYQGITQEKINEMRV 

APEQQMIADIHCMIAAGQDLDWIDAQGATLLHI 

AGANGYLRAAELLLDHGVRVDVKDWDGWEPL 

HAAAFWGQMQMAELLVSHGANU.NARTSMDE 

MPIDLCEEEEFKVLLLELK\HKHDVIMKSQLRHK 

SSLSRRTSHRQAS/SVGKVVRRTQPVGTGPNLXYR 

KEYE/GEEAILWQRSAXAEDQRTSTYNGDIREm 

TDQENKDPNPRLEK\PVLLSEFPTKIPRGELDMPV 

ENGLRAPVSAYQYALANGDVWKVHEVPDYSM 

AYGNPGVADATPPWSSYKEQSPQTLLELKRQRA 

AAKLLSHPFLSTHLGSSMARTGESSSEGKAPLIG 

GRTSPYSSNGTSVYYTVTSGDPPLLKFKAPffiEM 

EEKVHGCCRIS 


3495 


A 


327 


1078 


APMADTTPNGPQGAGAVQFMMTNKLDTAMWL 

SRLFTVYCSALFVLPLLGLHEAASFYQRALLANA 

LTSALRLHQRLPHFQLSRAFLAQALLEDSCHYLL 

YSLIFVNSYPVTMSIFPVLLFSLLHAATYTKKVU 

DARG\SNSLPLLR\SVLDKLSANQQNILKFIACNE1 

FLMPATVFMLFSGQGSLLQPFIYYRFLTLRYSSRR 

NPYCRTLFNELRIWEHIIMKPACPLFVRRLCLQS 

IAFISRLAPTVP


3496 


A 


3 


2867 


SSRTREMEEKEELRRQIRLLQGLIDDYKTLHGNAP 

APGTPAASGWQPPTYHSGRAFSARYPRPSRRGYS 

SHHGPSWRKKYSLVNRPPGPSDPPADHAVRPLH 

GARGGQPPVPQQHVLERQVQLSQGQNWIKVKP 

PSKSGSASASGAQRGSLEEFEDTPWSDQRPREGE 

GEPPRGQLQPSRPTRARGTCSVEDPLLVCQKEPG 

KPRMVKSVGSVGDSPREPRRTVSESVIAVKASFP 

SSALPPRTGVALGRKLGSHSVASCAPQLLGDRRV 

DAGHTDQPVPSGSVGGPARPASGPRQAREASLV 

VTCRTNKFRKNNYKWVAASSKSPRVARRALSPR 

VAAENVCKASAGMANKVEKPQLIADPEPKPRKP 

ATSSKPGSAPSKYKWKASSPSASSSSSFRWQSEA 

GSKDHASQLSPVLSRSPSGD\RPALAHSGLKPLSG 

ETPLSAYKVKTRTKIIRRRGSTSLPGDKKSGTSPA 

ATAKSHLSLRRRQALRGKSSPVLKKTPNKGLVQ 

VTKHRLCRLPPSRAHLPTKEASSLHAVRTAPTSK 

VKTRYRIVKKTPASPLSAPPFPLSLPSWRARRLS 

LSRSLVLNRLRPVASGGGKAQPGSPWWRSKGYR 

CIGGVLYKVSANKLSKTSGQPSDAGSRPLLRTGR 

LDPAGSCSRSLASRAVQRSLAIIRQARQRREKRK 

EYCMYYNRFGRCNRGERCPYIHDPEKVAVCTRF 

VRGTCKKTDGTCPFSHHVSKEKMPVCSYFLKGI 

CSNSNCPYSHVYVSRKAEVCSDFLKGYCPLGAK 

CKKKHTLLCPDFARRGACPRGAQCQLLHRTQKR 

HSRRAATSPAPGPSDATARSRVSASHGPRKPSAS 

QRPTRQTPSSAALTAAAVAAPPHCPGGSASPSSS 

KASSSSSSSSSPPASLDHEXAPSLQEAALAAACSN 

RLCKLPSFISLQSSPSPGAQPRVRAPRAPLTKDSG 

KPLHIKPRL 


3497 


A 


1586 


141 


ATARDLGCARRIDRVVMESTPSRGLNRVHLQCR 

NLQEFLGGLSPGVLDRLYGHPATCLAVFRELPSL 

AKNWVMRMLFLEQPLPQAAVALWVKKEFSKA 

QEESTGLLSGLRIWHTQLLPGGLQGLELNPIFRQN 

LRIALLGGGKAWSDDTSQLGPDKHARDVPSLDK 

YAEERWEVVLHFMVGSPSAAVSQDLAQLLSQA 

GLMKSTEPGEPPCITSAGFQFLLLDTPAQLWYFM 

LQYLQTAQSRGMDLVEILSFLFQLSFSTLGKDYS 

VEGMSDSLLNFLQHLREFGLVFQRKRKSRRYYP 

T/RALAINLSSGVSGAGGTVHQPGFIV\VETNYRL 

YAYTESELOIALIALFSEMLYPFPVNMVV\ARVTR\ 

X jTk X X X.tiJX./XJ\^XJ\.X.tXtM.M^X iJX^XyxX^ X X X X XX^ITX V V ViVXX, V X Xx.\ 

ESVQQAIASGITAQQIIHFLRTRAHPVMLKQTPVL 
PPTITDQIRLWELERDRLRJFTEGVLYNQFLSQVDF 
ELLVLAHAPKLG VLVFE/NTPAKRLMVVTPAGHS 
DVKRFWKRQKHSS 


3498 


A 


790 


190 


RDLGPAALMTASASSFSSSQGVQQPSIYSFSQITR 

SLFLSNGVAANDKLIXSSNRITAIVNASVGSGORI 

LRGNLQYIKVPVTDARDSRLYDFFDPIADLIHTVS 

MRQGRTLLNCMAGVMSRSASLCLAYLMKYHSM 

SVLLDAHTWA/TKSRRPHRPNNGFWEQLINYEFK 

LFNNNTVRMINSPVGNIPDIYEKDLRMMISM 


3499 


A 


31 


1586 


TAGFLLAPLEMQRLLTPVKRILQLTRAVQETSLT 

PARLLPVAHQRFSTASAVPLAKTDTWPKDVGDL 

ALEVYFPAQYVDQTDLEKYNNVEAGKYTVGLG 

QTRMGFCSVQEDINSLCLTVVQRLMERIQLPWD 

SVGRLEVGTETIIDKSKAVKTVLMELFQDSGNTD 

IEGIDTTNACYGGTASLFNAANWMESSSWDGRY

AMVVCGDIAVYPSGNARPTGGAGAVAMLIGPK 

APLALERGLRGTHMENVYDFYKPNLASEYPIVD 

GKLSIQCYLRALDRCYTSYRKKIQNQWKQAGSD 

RPFTLDDLQYMIFHTPFCKMVQKSLARLMFNDF 

LSASSDTQTSLYKGLEAFGGLKLEDTYTNKDLD 

KALLKASODMFDKKTKASLYLSTHNGNMYTSSL 

YGCLASLLSHHSAQELAGSRIGAFSYGSGLAASF 

FSFRVSQDAAPGSPL\DKLVSSTSDLPKRLASRKC 

VSPEEFTEIMNQREQFYHKVNFSPPGDTNSLFPGT 

WYLERVDEQHRRKYARRPV 


3500 


A 


185 


2692 


MLPTEVPQSHPGPSALLLLQLLLPPTSAFFPNIWS 
LLAAPGSITHQDLTEEAALNVTLQLFLEQPPPGRP 
PLRLEDFLGRTLLADDLFAAYFGPGSSRRFRAAL 
GEVSRANAAQDFLPTSRNDPDLHFDAERLGQGR 

ARLVGALRETVVAARALDHTLARQRLGAALHA 

LQDFYSHSNWVELGEQQPHPHLLWPRQELQNLA 

QVADPTCSDCEELSCPRNWLGFTLLTSGYFGTHP 

PKPPGKCSHGGHFDRSSSQPPRGGINKDSTSPGFS 

PHHMLHLQAAKLALLASIQAFSLLRSRLGDRDFS 

RLLDITPASSLSFVLDTTGSMGEEINAAKIQARHL 

VEQRRGSPMEPVHYVLVPFHDPGFGPVFTTSDPD 

SFWQQLNEIHALGGGDEPEMCLSALQLALLHTPP 

LSDIFVFTDASPKDAFLTNQVESLTQERRCRVTFL 

VTEDTSRVQGRARREILSPLRFEPYKAVALASGG 

EVIFTKDQHIRDVAAIVGESMAALVTLPLDPPVV 

VPGQPLVFSVDGLLQKITVRIHGDISSFWIKNPAG 

VSQGQEEGGGPLGHTRRFGQFWMVTMDDPPQT 

GTWEIQVTAEDTPGVRVQAQTSLDFLFHFGIPME 

DGPHPGLYPLTQPVAGLQTQLLVEVTGLGSRAN 

PGDPQPHFSHVILRGVPEGAELGQVPLEPVGPPE 

RGLLAASLSPTLLSTPRPFSLELIGODAAGRRLHR 

AAPQPSTVVPVLLELSGPSGFLAPGSKVPLSLRIA 

SFSGPQDLDLRTFVNPSFSLTSNLSRAHLELNESA 

WGRLWLEVPDSAAPDSVVMVTVTAGGREANPV 

PPTHAFLRLLVSAPAPQDRH 


3501 


A 


1245 


5815 


RRAHPSHSRLSPYLSVSRDPYFFVTVSRTILTLSA 

PAPPRRTPAPSMGTALLQRGGCFLLCLSLLLLGC 

WAELGSGLEFPGAEGQWTRFPKWNACCESEMSF 

QLKTRSARGLVLYFDDEGFCDFLELILTRGGRLQ 

LSFSIFCAEPATLLADTPVNDGAWHSVRIRRQFR 

NTTLFIDQVEAKWVEVKSKREIDMTVFSGLFVGG 

LPPELRAAALKLTLASVREREPFKGWIRDVRVNS 

SQVLPVDSGEVKLDDEPPNSGGG\SPCEAGEEGE 

GGVCLNGGVCSVVDDQAVCDCSRTGFRGKDCS 

QEDNNVEGLAHLMMGDQGKEEYIATFKGSEYF 

CYDLSQNPIQSSSDEITLSFKTLQRNGLMLHTGKS 

ADYVNLALKNGAVSLVINLGSGAFEALVEPVNG 

KFNDNAWHDVKVTRNLRQHSGIGHAMVTISVD 

GILTTTGYTQEDYTMLGSDDFFYVGGSPSTADLP 

GSPVSNNFMGCLKEVVYKNNDVRLELSRLAKQ 

GDPKMKIHGVVAFKCENVATLDPITFETPESnSL 

PKWNAKKTGSISFDFRTTEPNGLILFSHGKPRHQ 

KDAKHPQMIKVDFFAIEMLDGHLYLLLDMGSGT 

IKIKALLKKVNDGEWYHVDFQRDGRSGTISVNT 

LRTPYTAPGESEILDLDDELYLGGLPENKAGLVF 

PTEVWTALLNYGYVGCIRDLFIDGQSKDIRQMA 

EVQSTAGVKPSCSKETAKPCLSNPCKNNGMCRD 

GWNRYVCDCSGTGYLGRSCEREATVLSYDGSM 

FMKIQLPVVMHTEAEDVSLRFRSQRAYGILMAT 

TSRDSADTLRLELDAGRVKLTVNLDCmiNCNSS 

KGPETLFAGYNLNDNEWHTVRWRRGKSLKLT 

VDDQQAMTGQMAGDHTRLEFHNIETGUTERRY 

LSSVPSNFIGHLQSLTFNGMAYIDLCKNGDIDYC 

ELNARFGFRNIIADPVTFKTKSSYVALATLQAYT 

SMHLFFQFKTTSLDGLILYNSGDGNDFIWELVK 

GYLHYVFDLGNGANLIKGSSNKPLNDNQWHNV 

MISRDTSNLHTVKIDTKITTQITAGARNLDLKSDL 

YIGGVAKETYKSLPKLVHAKEGFQGCLASVDLN 

G\RLP\DLISDGSFSCNGTDSRRGMWKGPSTnCQ 

EDSCSNQGVCLQQWDGFSCDCSMTSFSGPLCND 

PGTTYIFSKGGGQITYKWPPNDRPSTRADRLAIGF 

STVQKEAVLVRVDSSSGLGDYLELHIHQGKIGVK 

FNVGTDDIAIEESNAIINDGKYHWRFTRSGGNA 

TLQVDSWPVIERYPAGRQLTIFNSQATniGGKEQ 

GQPFQGQLSGLYYNGLKVLNMAAENDANIAIVG 

NVRLVGEVPSSMTTESTATAMQSEMSTSIMETTT 

TLATSTARRGKPPTECEPISQTTDDILVASAECPSD 

DEDIDPCEPSSGGLANPTRAGGREPYPGSAEVIRE 

SSSTTGlVrVVGIVAAAALCILILLYAMYKYRNRDE 

GSYHVDESRNYISNSAQSNGAVVKEKQPSSAKSS 

NKNKKNKDKEYYV 


3502 


A 


394 


72 


KPAHLPFTVIIMPKRKPSEGAMSDKVKA/KFELQ 
RRSAGLFSKPTPPKPETRPKKDPANQRQKLPKVR 
KGKADA/SKEGNSPAEERCSMVQTQKVEGWRSG 
SELPVALSF 


3503 


A 


43 


3358 


SGGRGPVRVRSEQLSPSAEQVSQISQISLGRRPLS 

SLPPPPSRALAPTRAPDTALTIMEVAEVESPLNPS 

CKIMTFRPSMEEFREFNKYLAYMESKGAHRAGL 

AKVIPPKEWKPRQCYDDIDNLLIPAPIQQMVTGQ 

SGLFTQYNIQKKAMTVKEFRQLANSGKYCTPRY 

LDYEDLERKYWKNLTFVAPIYGADINGSIYDEGV 

DEWNIARLNTVLDVVEEECGISIEGVNTPYLYFG 

MWKTTFAWHTEDMDLYSINYLHFGEPKSWYAIP 

PEHGKRLERLAQGFFPSSSQGCDAFLRHKMTLIS 

PSVLKKYGIPFDKITQEAGEFMITFPYGYHAGFN 

HGFNCAEStNFATVRWIDYGKVAKLCTCRKDM 

VKISMDIFVRKFQPDRYQLWKQGKDIYTIDHTKP 

TPASTPEVKAWLQRRRKVRKASRSFQCARSTSK 

RPKADEEEEVSDEVDGAEVPNPDSVTDDLKVSE 

KSEAAVKLRNTEASSEEESSASRMQVEQNLSDHI 

KLSGNSCLSTSVTEDIKTEDDKAYAYRSVPSISSE 

ADDSIPLSTGYEKPEKSDPSELSWPKSPESCSSVA 

ESNGVLTEGEESDVESHGNGLEPGEIPAVPSGER 

NSFKVPSIAEGENKTSKSWRHPLSRPPARSPMTL 

VKQQAPSDEELPEVLSIEEEVEETESWAKPLIHL 

WQTKPPNFAAEQEYNATVARMKPHCAICTLLMP 

YHKPDSSNEENDARWETKLDEWTSEGKTKPLIP 

EMCFIYSEENIEYSPPNAFLEEDGTSLLISCAKCC 

VRVHASCYGPSHBICDGWLCARCKRNAWTAEC 

CLCNLRGGALKQTKNNKWAHVMCAVAVPEVR 

FTNVPERTQIDVGRIPLQRLKLKCIFCRHRVKRVS 

GACIQCSYGRCPASFHVTCAHAAGVL\MEPDDW 

PYVVNITCFRHKVNPNVKSKACEKVISVGQTVIT 

KHRNTRYYSCRVMAVTSOTFYEVMFDDGSFSRD 

TFPEDIVSRDCLKLGPPAEGEVVQVKWPDGKLY 

GAKYFGSNIAHMYQVEFEDGSQIAMKREDIYTL 

DEELPKRVKARFVSAGRCHLGTCQVNSLSSPHVS 

QAQQETYLGFWINSKKSQCNIFLSGTY 


3504 


A 


1124 


139 


RGEEQFDAEFRRFACLGFGERLQEFSRLLRAVHR 

SRAWTCYLAIRMLMATCCPSPTTTACTGPWQRA 

PPLRLLVQKREADSSGLAFASNSLQRRKKGLLLR 

PVAPLRTRPPLLISLPQDFRQVSSVIDVDLLPETH 

RRVRLHKHGSDRPLGFYIRDGMSVRVAPQG\LER 

VPGBFISRLVRGGLAESTGLLAVSDEILEVNGIEV

AGKTLNQVTDMMVANSHhALIVTVKPANQRNN 
VVRGASGRLTGPPSAGPGPAEPDSDDDSSDLVIE 
NRQPPSSNGLSQGPPCWDLHPGCRHPGTRSSLPS 
LDDQEQASSGWGSRIRGDGSGFSL 


3505 


A 


3 


2898 


SCRSATSQSGCGGGRSWLCSSLKMAAQPPRGIRL 

SALCPKFLHTNSTSHTWPFSAVAELIDNAYDPDV 

NAKQIWIDKTVINDHICLTFTDNGNGMTSDKLH 

KMLSFGFSDKVTMNGHVPVGLYGNGFKSGSM\R 

LGKDAIVFTKNGESMSVGLLSQTYLVEVIKAEHV 

VVPIVAFNKHRQMINLAESKASLAAILEHSLFSTE 

QKLLAELDAIIGKKGTRinWNLRSYKNATEFDFE 

KDKYDIRIPEDLDEITGKKGYKKQERMDQIAPES 

DYSLRAYCSILYLKPRMQnLRGQKVKTQLVSKS 

LAYIERDVYRPKFLSKTVRITFGFNCRNKDHYGI 

MMYHRNRLIKAYEKVGCQLRANNMGVGVVGn 

ECNFLKPTHNKQDFDYTNEYRLTITALGEKLND 

YWNEMKVKKNTEYPLNLPVEDIQKRPDQTWVQ 

CDACLKWRKLPDGMDQLPEKWYCSNNPXDPQFR 

NCEVPEEPEDEDLVHPTYEKTYKKTNKEKFRIRQ 

PEMIPRINAELLFRPT\ALSTPS\FSSPKESVSKR/RH 

LSEGTNSYATRLLNNHQVPPQSEPESNSLKRRLS 

TRSSILNAKNRRL\SSQF\ENSVYKG\DDDDEDVII 

LEENSTPKPAVDHDIDMKSEQSHVEQGGVQVEF 

VGDSEPCGQTGSTSTSSSRCDQGNTAATQTEVPS 

LVVKKEETVEDEIDVRNDAVILPSCVEAEAKIHE 

TQETTDKSADDAGCQLQELRNQLLLVTEEKENY 

KRQCHMFTDQIKVLQQRILEMNDKYVKKETCH 

QSTETDAVFLLESINGKSESPDHMVSQYQQALEE 

lERLKKOCSALOHVKAECSOCSNNESKSEMDEM 

AVQLDDVFRQLDKCSIERDQYKSEVELLEMEKS 

QIRSQCEELKTEVEQLKSTNQQTATDVSTSSNIEE 

SVNHMDGESLKLRSLRVNVGQLLAMIVPDLDLQ 

QVNYDVDWDEILGQWEQMSEISST 


3506 


A 


2 


2120 


RPPEAGGRYRAGGRRQAAKPSRPPLPSRRRLPQG 

GRTRRAMDRPAAAAAAGCEGGGGPNPGPAGGR 

RPPRAAGGATAGSRQPSVETLDSPTGSHVEWCK 

QLIAATISSQISGSVTSENVSRDYKALRDGNKLA 

QMEEAPLFPGESIKAIVKDVMYICPFMGAVSGTL 

TVTDFKLYFKNVERDPHFILDVPLGVISRVEKIGA 

QSHGDNSCGIEIVCKDMRNLRLAYK\QEEQSKLG 

IFENLNKHAFPLSNGQALFAFSYKEKFPINGWKV 

YDPVSEYKRQGLPNESWKISKINSNYEFCDTYPA 

UWPTSVKDDDLSKVAVFLAKGRVPVLSWIHPE 

SQATITRCSQPLVGPNDKRCKEDEKYLQTIMDAN 

AQSHKLIIFDARQNSVADTNKTKGGGYESESAYP 

NAELVFLEIHNIHVMRESLRKLKEIVYPSIDEARW 

LSNVDGTHWLEYIRMLLAGAVRIADKIESGKTSV 

WHCSDGWDRTAQLTSLAMLMLDSYYRTIKGFE 

TLVEKEWISFGHRFALRVGHGNDNHADADRSPIF 

LQFVDCVWQMTRQFPSAFEFNELFLITILDHLYS 

CLFGTFLCNCEQQRFKEDVYTKTISLWSYINSQL 

DEFSNPFFVNYENHVLYPVASLSHLELWVNYYV 

RWNPRMRPQMPIHQNLKELLAVRAELQKRVEG 

LQREVATRAVSSSSERGSSPSHFATSVHTLV 


3507 


A 


1 


2169 


GSSIKIRLTVLCAKNLAKKDFFRLPDPFVAKIVVD 






WO 01/57190 

PCT/US01/04098 



GSGQCHSTDTVKNTLDPKWNQHYDLYVGKTDSI 

TISVAVNHKKIHKKQGAGFLGCVKLLSNAISRLKD 

TGYQRLDLCKLNPSDTDAVRGQIWSLQTRDRIG 

TGGSVVDCRGLLENEGTVYEDSGPGRPLSCFME 

EPAPYTDSTGAAAGGGNCRFVESPSQDQRLQAQ 

RLRNPDVRGSLQTPQNRPHGHQSPELPEGYEQRT 

TVQGQVYFLHTQTGVSTWHDPRIPRDLNSVNCD 

ELGPLPPGWEVRSTVSGRIYFVDHNNRTTQFTDP 

RLHHIMNHQCQLKEPSQPLPLPSEGSLEDEELPA 

QRYERDLVQKLKVLRHELSLQQPQAGHCRIEVS 

REEIFEESYRQIMKMRPKDLKKRLMVKFRGEEG 

LDYGGVAREWLYLLCHEMLNPYYGLFQYSTDNI 

YMLQINPDSSINPDHLSYFHFVGRIMGLAVFHGH 

YINGGFTVPFYKQLLGKPIQLSDLESVDPELHKSL 

VWILENDITPVLDHTFCVEHNAFGRILQHELKPN 

G\RNVPVTEENKKEYVRLYVNWRFmGIEAQFL 

AT OKGFNFT IPOHLLKPFDOKELELTIGGLDKIDL 

NDWKSNTRLKHCVADSNIVRWFWQAVETFDEE 

RRARLLQFVTGSTRVPLQGFKALQGSTGXAAGPR 

LFTIHLIDANTDNLRKAHTCFNRIDIPPYESYEKL 

YEKLLTAVEETCGFAVE 


3508 


A 


3 


6388 


ILYINPADLGWNPPVSSWIEKREIQTERANLTILF 

DKYLPTCLDTLRTRFKKIIPIPEQSMVQMVCHLLE 

CLLTTEDIPADCPKEIYEHYFVFAAIWAFGGAMV 

QDQLVDYRAEFSKWWLTEFKTVKFPSQGTIFDY 

YIDPETKKFEPWSKLVPQFEFDPEMPLQACLVHT 

SETIRVCYFMERLMARQRPVMLVGTAGTGKSVL 

VGAKLASLDPEAYLVKNVPFNYYTTSAMLQAVL 

EKPLEKXAGRl^GPPGNKKLIYFIDDMNMPEVD 

AYGTVQPHTIIRQHLDYGHWYDRSKLSLKEITNV 

QYVSCMNPTAGSFTINPRLQRHFSVFVLSFPGAD 

ALSSIYSHLTQHLKLGNFPASLQKSIPPLIDLALAF 

HQKIATTFLPTGIKFHYIFNLRDFANIFQGILFSSV 

ECVKSTWDLIRLYLHESNRVYRDKMVEEKDFDL 

FDKIQTEVLKKTFDDIEDPVEQTQSPNLYCHFAN 

GIGEPKYMPVQSWELLTQTLVEALENHNEVNTV 

MDLVLFEDAMRHVCHINRILESPRGNALLVGVG 

GSGKQSLTRLAAFISSMDVFQITLRKGYQIQDFK 

MDLASLCLKAGVKNLNTVFLMTDAQVADERFL 

VLINDLLASGEIPDLYSDDEVENIISNVKNEVKSQ 

GLVDNRENCWKFFIDRIRRQLKVTLCFSPVGNKL 

RVRSRKFPAIVNCTAIHWFHEWPQQALESVSLRF 

LQNTEGIEPTVKQSISKFMAFVHTSVNQTSQSYLS 

NEQRYNYTTPKSFLEFIRLYQSLLHRHRKELKCK 

TERLENGLLKLHSTSAQVDDLKAKLAAQEVELK 

QKNEDADKLIQVVGVETDKVSREKAMADEEEQ 

KVAVIMLEVKQKQKDCEEDLAKAEPALTAAQA 

ALNTLNKTNLTELKSFGSPPLAVSNVSAAVMVL 

MAPRGRVPKDRSWKAAKVTMAKVDGFLDSLIN 

FNKENIHENCLKAIRPYLQDPEFNPEFVATKSYA 

AAGLCSWVINIVRFYEVFCDVEPKRQALNKATA 

DLTAAQEKLAAIKAKIAHLNENLAKLTARFEKA 

TADKLKCQQEAEVTAVnSLANRLVGGLASENV 

RWADAVQNFKQQERTLCGDILLITAFISYLGFFT 

KKYRQSLLDRTWRPYLSQLKTPIPVTPALDPLRM

LMDDADVAAWQNEGLPADRMSVENATILINCE 

RWPLMVDPQLQGKWIKNKYGEDLRVTQIGQKG 

YLQIIEQALEAGAWLIENLEESIDPVLGPLLGRE 

VIKKGRFIKIGDKECEYNPKFRLILHTKLANPHYQ 

PELQAQATLINFTVTIUDGLEDQLLAAVVSMERP 

DLEQLKSDLTKQQNGFKTTLKTLEDSLLSRLSSAS 

GNFLGETVLVENLEITKQTAAEVEKKVQEAKVT 

EVKINEAREHYRPAAARASLLYFIMNDLSKIHPM 

YQFSLKAFSIVFQKAVERAAPDESLRERVANLID 

SITFSVYQYTIRGLFECDKLTYLAQLTFQILLMNR 

EVNAVELDFLLRSPVQTGTASPVEFLSHQAWGA 

VKVLSSMEEFSNLDRDIEGSAKSWKKFVESECPE 

KEKLPQEWKNKTALQRLCMLRAMRPDRMTYAL 

RDFVEEKLGSKYVVGRALDFATSFEESGPATPMF 

FILSPGVDPLKDVESQGRKLGYTFNNQNFHNVSL 

GQGQEVVAEAALDLAAKKGHWVILQNTLEMCS 

RETEFKSILFALCYFHAVVAERRKFGPQGWNRSY 

PFNTGDLTISVNVLYNFLEANAKVPYDDLRYLFG 

EIMYGGHITDDWDRRLCRTYLGEFIRPEMLEGEL 

SLAPGFPLPGNMDYNGYHQYIDAELPPESPYLYG 

LHPNAEIGFLTQTSEKLFRTVLELQPRDSQARDG 

AGATREEKVKALLEEILERVTDEFNIPELMAKVE 

ERTPYIVVAFQECGRMNILTREIQRSLRELELGLK 

GELTMTSHMENLQNALYFDMVPESWARRAYPS 

TAGLAAWFPDLLNRIKELEAWTGDFTMPSTVWL 

TGFFNPQSFLTAIMQSTARKNEWPLDQMALQCD 

MTKKNREEFRSPPREGAYIHGLFMEGACWDTQA 

GIITEAKLKDLTPPMPVMFIKAIPAD\RQDCGHVY 

SCPVTKTSQ\RDPTYVWTFNLKTKENPSKWVLA 

GVALLLQI 


3509 


A 


3 


6388 


ILYINPADLGWNPPVSSWIEKREIQTERANLTILF 

DKYLPTCLDTLRTRFKKIIPIPEQSMVQMVCHLLE 

CLLTTEDIPADCPKEIYEHYFVFAAIWAFGGAMV 

QDQLVDYRAEFSKWWLTEFKTVKFPSQGTIFDY 

YIDPETKKFEPWSKLVPQFEFDPEMPLQACLVHT 

SETIRVCYFMERLMARQRPVMLVGTAGTGKSVL 

VGAKLASLDPEAYLVKNVPFNYYTTSAMLQAVL 

EKPLEKKAGRNYGPPGNKKLIYFroDlVD^JMPEVD 

AYGTVQPHTIIRQHLDYGHWYDRSKLSLKEITNV 

QYVSCMNPTAGSFTINPRLQRHFSVFVLSFPGAD 

ALSSIYSIILTQHLKLGNFPASLQKSIPPLIDLALAF 

HQKIATTFLPTGIKFHYIFNLRDFANIFQGILFSSV 

ECVKSTWDLIRLYLHESNRVYRDKMVEEKDFDL 

FDKIQTEVLKKTFDDIEDPVEQTQSPNLYCHFAN 

GIGEPKYMPVQSWELLTQTLVEALENHNEVNTV 

MDLVLFEDAMRHVCHINRILESPRGNALLVGVG 

GSGKQSLTRLAAFISSMDVFQITLRKGYQIQDFK 

MDLASLCLKAGVKNLNTVFLMTDAQVADERFL 

VLINDLLASGEIPDLYSDDEVENIISNVKNEVKSQ 

GLVDNRENCWKFFIDRIRRQLKVTLCFSPVG]NKL 

RVRSRKFPAIVNCTAIHWFHEWPQQALESVSLRF 

LQNTEGffiPTVKQSISKFMAFVHTSVNQTSQSYLS 

NEQRYNYTTPKSFLEFIRLYQSLLHRHRKELKCK 

TERLENGLLKLHSTSAQVDDLKAKLAAQEVELK 

QKNEDADKLIQVVGVETDKVSREKAMADEEEQ 



KVAVIMLEVKQKQKDCEEDLAKAEPALTAAQA 

ALNTLNKTNLTELKSFGSPPLAVSNVSAAVMVL 

MAPRGRVPKDRSWKAAKVTMAKVDGFLDSLIN 

FNKENIHENCLKAIRPYLQDPEFNPEFVATKSYA 

AAGLCSWINIVRFYEVFCDVEPKRQALNKATA 

DLTAAQEKIAAIKAKIAHLNENLAKLTARFEKA 

TADKLKCQQEAEVTAVTISLANRLVGGLASENV 

RWADAVQNFKQQERTLCGDILLITAFISYLGFFT 

KKYRQSLLDRTWRPYLSQLKTPIPVTPALDPLRM 

LMDDADVAAWQNEGLPADRMSVENATILINCE 

RWPLMVDPQLQGIKWIKNKYGEDLRVTQIGQKG 

YLQIIEQALEAGAVVLIENLEESIDPVLGPLLGRE 

VIKKGRFIKIGDBCECEYOTKFRLILHTKLANPHYQ 

PELQAQATLINFTVTRDGLEDQLLAAVVSMERP 

DLEQLKSDLTKQQNGFKITLKTLEDSLLSRLSSAS 

GNFLGETVLVENLEITKQTAAEVEKKVQEAKVT 

EVKINEAREHYRPAAARASLLYFIMNDLSKIIBPM 

YQFSLKAFSIVFQKAVERAAPDESLRERVANLID 

SITFSVYQYTIRGLFECDKLTYLAQLTFQILLMNR 

EVNAVELDFLLRSPVQTGTASPVEFLSHQAWGA 

VKVLSSMEEFSNLDRDIEGSAKSWKKFVESECPE 

KEKLPQEWKNKTALQRLCMLRAMRPDRMTYAL 

RDFVEEKLGSKYWGRALDFATSFEESGPATPMF 

FILSPGVDPLKDVESQGRKLGYTFNNQNFHNVSL 

GQGQEVVAEAALDLAAKKGHWVILQNTLEMCS 

RETEFKSILFALCYFHAVVAERRKFGPQGWNRSY 

PFNTGDLTISVNVLYNFLEANAKVPYDDLRYLFG 

EIMYGGHITDDWDRRLCRTYLGEFIRPEMLEGEL 

SLAPGFPLPGNMDYNGYHQYIDAELPPESPYLYG 

LHPNAEIGFLTQTSEKLFRTVLELQPRDSQARDG 

AGATREEKVKALLEEILERVTDEFNIPELMAKVE 

ERTPYIWAFQECGRMNILTREIQRSLRELELGLK 

GELTMTSHMENLQNALYFDMVPESWARRAYPS 

TAGLAAWFPDLLNRIKELEAWTGDFTMPSTVWL 

TGFFNPQSFLTAIMQSTARKNEWPLDQMALQCD 

MTKKNREEFRSPPREGAYIHGLFMEGACWDTQA 

GniEAKLKDLTPPMPVMFIKAIPADVRQDCGHVY 

SCPVTKTSQ\RDPTYVWTFNLKTKENPSKWVLA 

GVALLLQI 


3510 


A 


390 


3330 


AAGSGSRPPAPAARKMADLAECNIKVMCRFRPL 

NESEVNRGDKYIAKFQGEDTVVIASKPYAFDRVF 

QSSTSQEQVYNDCAKKIVKDVLEGYNGTIFAYG 

QTSSGKTHTMEGKLHDPEGMGIIPRIVQDIFNYIY 

SMDENLEFHIKVSYFEIYLDKIRDLLDVSKTNLSV 

HEDKNRVPYVKGCTERFVCSPDEVMDTIDEGKS 

NRHVAVTNMNEHSSRSHSIFLINVKQENTQTEQK 

LSGKLYLVDLAGSEKVSKTGAEGAVLDEAKNIN 

KSLSALGNVISALAEGSTYVPYRDSKMTRILQDS 

LGGNCRTTIVICCSPSSYNESETKSTLLFGQRAKTI 

KNWC\nWELTAEQWKKKYEKEKEKNKILRNTI 

QWLENELNRWRNGETVPIDEQFDKEKANLEAFT 

VDKDITLTNDKPATAIGVIGNFTDAERRKCEEEIA 

KLYKQLDDKDEEINQQSQLVEKLKTQMLDQEEL 

LASTRRDQDNMQAELNRLQAENDASKEEVKEV 

LQALEELAVNYDQKSQEVEDKTKEYELLSDELN

QKSATLASroAELQKLKEMTNHQKKRAAEMMA 

SLLKDLAEIGIAVGNNDVKQPEGTGMIDEEFTVA 

RLYISKMKSEVKTNdVKROKQLESTQTESNJ^^ 

ENEKELAACQLRISQHEAKIKSLTEYLQNVEQKK 

RQLEESVDALSEELVQLRAQEKVHEMEKEHLNK 

VQTANBVKQAVEQQIQSHRETHQKQISSLRDEVE 

AKAKLITDLQDQNQKMMLEQERLRVEHEKLKA 

TDQEKSRKLHELTVMQDRREQARQDLKGLEETV 

AKELQTLHNLRKLFVQDLATRVKKSAEIDS\DDT 

GGSAAQKQKISFLENNLE\QLTKSAQTSWYRDNA 

DLRCELPKLEKRLRATAERVKALESALKEAKEN 

ASRDRKRYQQEVDRIKEAVRSKNMARRGHSAQI 

AKPIRPGQHPAASPTHPSAIRGGGAFVQNSQPVA 

VRGGGGKQV 


3511 


A 


1 


1757 


MASVQASRRQWCYLCDLPKMPWAMVWDFSEA 

VCRGCVNFEGADRIELLIDAARQLKRSHVLPEGR 

SPGPPALKHPATKDLAAAAAQGPQLPPPQAQPQP 

SGTGGGVSGQDRYDRATSSGRLPLPSPALEYTLG 

SRLANGLGREEAVAEGARRALLGSMPGLMPPGL 

LAAAVSGLGSRGLTLAPGLSPARPLFGSDFEKEK 

QQRNADCLAELNEAMRGRAEEWHGRPKAVREQ 

LLALSACAPFNVRFKKDHGLVGRVFAFDATARP 

PGYEFELKLFTEYPCGSGNVYAGVLAVARQMFH 

DALREPGKALASSGFKYLEYERRHGSGEWRQLG 

ELLTDGVRSFREPAPAEALPQQYPEPAPAALCGP 

PPRAPSRNLAPTPRRRKASPEPEGEAAGKMTTEE 

QQQRHWVAPGGPYSAETPGVPSPIAALKl^AEA 

LGHSPKDPGGGGGPVRAGGASPAASSTAQPPTQ 

HRLVARNGEAEVSPTAGAEAVSGGGSGTGATPG 

APLC\CTLCRERLEDTHFVQ\CPPVPEHKFCFPCSR 

KFIKAQGPAGEWYCPSGDKCPLVGSSVPWAFMQ 

GEIATILAGDIKVKKERDP 


3512 


A 


3 


1994 


NTNSSSVTNSAAGVEDLNIVQVTVPDNEKERLSS 

lEKIKQLREQVNDLFSRKFGEAIGVDFPVKVPYR 

KITFNPGCWIDGMPPGVVFKAPGYLEISSMRRIL 

EAAEFIKFTVIRPLPGLELSNGEYSTVGKRKIDQE 

GRVFQEKWERAYFFVEVQNISTCLICKRSMSVSK 

EY>a.RRHYQThfflSKHYDQYMERMRDEKLHELK 

KGLRKYLLGLSDTECPEQKQVFANPSPTQKSPVQ 

PVEDLAGNLWEKLREKIRSFVAYSIAIDEITDINN 

TTQLAIFIRGVDENFDVSEELLDTVPMTGTKSGN 

EIFSRVEKSLICNFCINWSKLVSVASTGTPPMVDA 

NNGLVTKLKSRVATFCKGAELKSICCIIHPESLCA 

Q\KLKMDHVlvn3VWKSV>WICSRGLNHSEm 

LYELDSQYGSLLYYTEIKWLSRGLVLKRFFESLE 

EIDSFMSSRGKPLPQLSSIDWIRDLAFLVDMTMH 

LNALNISLQGHSQIVTQMYDLIRAFLAKLCLWET 

HLTRNNLAHFPTLKLVSR>IESDGLNYIPK1AELK 

TEFQKRLSDFKLYESELTLFSSPFSTKIDSVHEELQ 

MEVIDLQCNTVLKTKYDKVGIPEFYKYLWGSYP 

KYKIfflCAKILSMFGSTYICEQLFSIMKLSKTKYC 

SQLKDSQWDSVLHIAT 


3513 


A 


1836 


513 


FKSLLSVKWFCFSILVLIFLGTRCYWEMTQSRPSP 
DPHRGRWEGGRSRPKGGEEGRRRTRVPGLVTAS 
GPGNPLPDRLGEMAGGRHRRWGTLHLLLLVAA

LPWASRGVSPSASAWPEEKNYHQPAILNSSALRQ 

lAEGTSISEMWQNDLQPLLIERYFGSFGSYAARQ 

HIMQRIQRLQADWVLEIDTFLSQTPYGYRSFSNn 

STLNPTAKRHLVLACHYDSKYFSHWVNNRVFVG 

ATDSAVPCAMMLELARALDKKLLSLKTVSDSKP 

DLSLQLIFFDGEEAFLHWSPQDSLYGSRHLAAKM 

ASTPHPPGARGTSQLHGMDLLVLLDLIGAPNPTF 

PNFFPNSARWFERLQAIEHELHELGLLKDHSLEG 

RYFQNYSYGGVIQDDHIPFLRRGVPVLHLIPSPFP 

EVWHTMDDNEENLDESTIDNLNKILQVFVLEYL 

HL 


3514 


A 


1836 


513 


FKSLLSVKWFCFSILVLIFLGTRCYWEMTQSRPSP 

DPHRGRWEGGRSRPKGGEEGRRRTRVPGLVTAS 

GPG>fPLPDRLGEMAGGRHRRWGTLHLLLLVAA 

LPWASRGVSPSASAWPEEKNYHQPAILNSSALRQ 

lAEGTSISEMWQNDLQPLLIERYPGSPGSYAARQ 

HIMQRIQRLQADWVLEIDTFLSQTPYGYRSFSNII 

STLNPTAKRHLVLACHYDSKYFSHW\NNRVFVG 

ATDSAVPCAMMLELARALDKKLLSLKTVSDSKP 

DLSLQLIFFDGEEAFLHWSPQDSLYGSRHLAAKM 

ASTPHPPGARGTSQLHGMDLLVLLDLIGAPNPTF 

PNFFPNSARWFERLQAIEHELHELGLLKDHSLEG 

RYFQNYSYGGVIQDDHIPFLRRGVPVLHLIPSPFP 

EVWHTMDDNEENLDESTIDNLNKILQVFVLEYL 

HL 


3515 


A 


114 


754 


LCRDLTTTMSSKJITKTKTKKRPQRATSI^FAMF 

DQSQIQEFKEAFNMIDQNRDGFIDKEDLHDMLAS 

LGKNPTDEYLDAMMNEAPGPINFTMFLTNIFGEK 

LNGTDPEDVIRNAFACFDEEATGTIQEDYLRELL 

TTVMGDRF\TDE\EVDELYREAP1\DKKGGIFNY1\E 

FTRHLETGGPKDKDDRKITFQIPSPNVPWLATFG 

VFLEIFLLHGP 


3516 


A 


1 


5169 


MAAAPSALLLLPPFPVLSTYRLQSRSRPSAPETDD 

SRVGGIMRGEKNYYFRGAAGDHGSCfTTTSPLA 

SALLMPSEAVSSSWSESGGGLSGGDEEDTRLLQL 

LRTARDPSEAFQALQAALPRRGGRLGFPRRKEAL 

YRALGRVLVEGGSDEKRLCLQLLSDVLRGQGEA 

GQLEEAFSLALLPQLVVSLREENPALRKDALQIL 

HICLKRSPGEVLRTLIQQGLESTDARLRASTALLL 

PILLTTEDLLLGLDLTEVnSLARKLGDQETEEESE 

TAFSALQQIGERLGQDRFQSYISRLPSALRRHYN 

RRLESQFGSQVPYYLELEASGFPEDPLPCAVTLS 

NSNLKFGIIPQELHSRLLDQEDYKNRTQAVEELK 

QVLGKFNPSSTPHSSLVGFISLLYNLLDDSNFKVV 

HGTLEVLHLLVIRLGEQVQQFLGPVIAASVKVLA 

DNKLVIKQEYMKIFLKLMKEVGPQQVLCLLLEH 

LKHKHSRVREEVVNICICSLLTYPSEDFDLPKLSF 

DLAPALVDSKRRVRQAALEAFAVLASSMGSGKT 

SILFKAVDTVELQDNGDGVMNAVQARLARKTLP 

RLTEQGFVEYAVLMPSSAGGRSNHLAHGADTD 

WLLAGNRTQSAHCHCGDHVRDSMfflYGSYSPTI 

CTRRVLSAGKGKNKLPWENEQPGIMGENQTSTS 

KDIEQFSTYDFIPSAKLKLSQGMPVNDDLCFSRK 

RVSRNLFQNSRDFNPDCLPLCAAGTTGTHQTNLS 

GKCAQLGFSQICGKTGSVGSDLQFLGTTSSHQEK

VYASLNFGSKTQQTFGSQTECTSSNGQNPSPGAY 

ILPSYPVSSPRTSPKHTSPLIISPKKSQDNSVNFSNS 

WPLKSFEGLSKPKSHRRSLSAQKSSVDPtGRVNHG 

VENSQEKPPWQLTPALWRSPSSRRGLNGTKPVPPI 

P\RGISLLPDKADLSTVGHKKKEPDDIWKCEKDS 

LPJDLSELNFKDKDLDQEEMHSSLRSLRNSAAKK 

RAKLSGSTSDLESPDSAMKLDLTMDSPSLSSSPNI 

NSYSESGVYSQESLTSSLSTTPQGKRIMSDIFPTFG 

SKPCPTRLSSAKKKISHIAEQSPSAGSSSNPQQISS 

FDFTTTKALSEDSVVVVGKGVFGSLSSAPATCSQ 

SVISSVENGDTFSIKQSIEPPSGIYGRSVQQNISSYL 

DVENEKDAKVSISKSTYNKMRQKRKEEKELFHN 

KDCEKKEKNSWERMRHTGTEKMASESETPTGAI 

SQYKERMPSVTHSPEIMDLSELRPFSKPEIALTEA 

LRLLADEDWEKKIEGLNFIRCLAAFHSEILNTXL 

HETNFAVVQEVKNLRSGVSRAAWCLSDLFTYL 

KKSMDQELDTTVKVLLHKAGESNTFIREDVDKA 

LRAMVNNVTPARAVVSLINGGQRYYGRKMLFF 

MMCHPNFEKMLEKYVPSKDLPYIKDSVKNLQQK 

GLGEIPLDTPSAKGRRSHTGSVGNTRSSSVSRDA 

FNSAERAVTEVREVTRKSVPRNSLESAEYLKLIT 

GLLNAKDFRDRINGIKQLLSDTENNQDLVVGNIV 

KIFDAFKSRLHDSNSKVNLVALETMHKMIPLLRD 

HLSPIINMLIPAIVDNNLNSKNPGIYAAATNVVQA 

LSQHVDNYLLLQPFCTKAQFLNGKAKQDMTEKL 

ADIVTELYQRKPHATEQKVLVVLWHLLGNMTN 

SGSLPGAGGNIRTATAKLSKALFAQMGQNLLNQ 

AASQPPHIKKSLEELLDMTILNEL 


3517 


A 


1449 


252 


QDLKPVLDREYLAIYLKMVFFTCNACGESVKKI 

QVEKHVSVCRNCECLSCIDCGKDFWGDDYKNH 

VKCISEDQKYGGKGY/EKVKIHKGD/ASKQQAW 

IQKISELIK\RPNVSPKVRELLEQISAFDNVPQ\KK 

AKFQNWMKNSLKVHhreSILDQVWNIFSEASNSE 

PVNKEQDQKPLHPVANPHAEISTKVPASKVBCDA 

VEQQGEVKKNKRERKEERQKKRKREKKELKLE 

NHQENSRNQKPKKRKKGQEADLEAGGEEVPEA 

NGSAGKRSKKKKQRKDSASEEEARVGAGKRKR 

RHSKVETDSKKKKMKLPEHPEGGEPEDDEAPAK 

GKFNWKGTIKAILKQAPDNEITIKKLRKKVLAQY 

YTVTDEHHRSEEELLVIFNKKISKNPTFKLLKDK 

VKLVK 


3518 


A 


3 


635 


APDSNARNDHFDACSLRVQAGLSSAGPALGNSG 

LAALMASPSKAVrVPGNGGGDVTTHGWYGWVK 

KELEKIPGFQCLAKNMPDPITARESIWLPFMETEL 

HCDEKTIIIGHSSGAIAAMRYAETHRVYAIVLVSA 

YTSDLGDENERASGYFTRPWQWEKIKANCPYIV 

QFGSTDDPFLPWKEQQEVADXSWKPNCTNSLTV 

ATFRTQSFMN 


3519 


A 


81 


2277 


VRETRREMAMAMSDSGASRLRRQLESGGFEARL 

YVKQLSQQSDGDRDLQEHRQRIQALAEETAQNL 

KRNVYQNYRQFIETAREISYLESEMYQLSHLLTE 

QKSSLESIPLTLLPAAAAAGAAAASGGEEGVGGA 

GGRDHLRGQAGFFSTPGGASRDGSGPGEEGKQR 

TLTTLLEKVEGCRHLLETPGQYLVYNGDLVEYD 

ADHMAQLQRVHGFLMNDCLLVATWLPQRRGM

YRYNALYSLDGLAVVNVia)NPPMKDMFKLLMF 

PENRIFQAENAKKREWLEVLEDTKRALSEKRRR 

EQEEAAAPRGPPQVTSKATNPFEDDEEEEPAVPE 

VEEEKVDLSMEWIQELPEDLDVCIAQRDFEGAV 

DLLDKLNHYLEDKPSPPPVKELRAKVEERVRQL 

TEVLVFELSPDRSLRGGPKATRRAVSQLIRLGQC 

TKACELFLRNRAAAVHTAIRQLRIEGATLLYIHK 

LCHVFFTSLLETAREFEIDFAGTDSGCYSAFVVW 

ARSAMGMFVDAFSKQVFDSKESLSTAAECVKVA 

KEHCQQLGDIGLDLTFIIHALLVKDIQGALHSYK 

EIIIEATKHRNSEEMWRRMNLMTPEALGKLKEE 

MKSCGVSNFEQYTGDDCWVNLSYTVVAFTKQT 

MGFLEEALKLYFPELHMVLLESLVEIILVAVQHV 

DYSLRCEQDPEKKAFIRQNASFLYETVLNPVVEK 

RFEEGVGKPAKQLQDLRNASRLIRVNPESTTSVV 


3520 


A 


1706 


540 


FVAHLAWPWRADGDMEDGVLNEGFLVKRGfflV 

HNWKARWFILRQNTLVYYKLEGGRRVTPPKGRI 

LLDGCTITCPCLEYENRPLLIKLKTQTSTEYFLEA 

CSREE/RRDAWAFE\ITGAIHAGQARGKVQQLHS 

LRNSFKLPPFFLSLHRIVDKMHDSNTGIRSSPNMEQ 

GSTYKKTFLGSSLVDWLISNSFTASRLEAVTLAS 

MLMEENFLRPVGVRSMGAIRSGDLAEQFLDDST 

ALYTFAESYKKKISPKEEISLSTVELSGTVVKOGY 

LAKQGHKRKNWKVRRFVLRJKDPAFLHYYDPSK 

EENRPVGGFSLRGSLVSALEDNGVPTGVKGNVQ 

GNLFKVITK\DDTHYYIQA\SSKAE\RAE\WIGSLS 

KSLNMNKDPEGTPDSLPSLPR 


3521 


A 


3 


3063 


HASVSLSLGCPRPCADTPGPQPQPMDLRVGQRPP 

VEPPPBPTLLALQRPQRLHHHLFLAGLQQQRSVE 

PMRVKMELPACGATLSLVPSLPAFSIPRHQSQSST 

PCPFLGCRPCPQLSMDTPMPELQEAPQEQELRQL 

LHKDKSICRSAVASSVVKQKLAEVILKKQQAALE 

RTVHPNSPGIPYRTLEPLETEGATRSMLSSFLPPV 

PSLPSDPPEHFPLRKTVSEPNLKLRYKPKKSLERR 

KNPLLRKESAPPSLRRRPAETLGDSSPSSSSTPAS 

GCSSPNDSEHGPNPILGSEALLGQRLRLQETSVAP 

FALPTVSLLPAITLGLPAPARADSDRRTHPTLGPR 

GPILGSPHTPLFLPHGLEPEAGGTLPSRLQPILLLD 

PSGSHAPLLTVPGLGPLPFHFAQSLMTTERLSGSG 

LHWPLSRTOSEPLPPSATAPPPPGPMQPRLEQLKT 

HVQVKRSAKPSEKPRLRQIPSAEDLETDGGGPG 

QVVDDGLEHRELGHGQPEARGPAPLQQHPQVLL 

WEQQRLAGRLPRGSTGDTVLLPLAQGGHRPLSR 

AQSSPAAPASLSAPEPASQARVLSSSETPARTLPF 

TTCLIYDSVMLKHQCSCGDNSRHPEHAGRIQSIW 

SRLQERGLRSQCECLRGRKASLEELQSVHSERHV 

LLYGTNPLSRLKLDNGKLAGLLAQRMFVMLPCG 

GVGVDTDTIWNELHSSNAARWAAGSVTDLAFK 

VASRELKNGFAVVRPPGHHADHSTAMGFCFFNS 

VAIACRQLQQQSKASKILIVDWDVHHGNGTQQT 

FYQDPSVLYISLHRHDDGNFFPGSGAVDEVGAGS 

GEGFNVNVAWAGGLDPPMGDPEYLAAFRIWM 

PIAREFSPDLVLVSAGFDAAEGHPAPLGGYHVSA 

KCFGYMTQQLMNLAGGAVVLALEGGHDLTAIC 

DASEACVAALLGNRVDPLSEEGWKQKPNLNAIR

SLEAWIRVHSKYWGCMQRLASCPDSWVPRVPG 
ADKEEVEAVTALASLSVGILAEDRPSEQLVEEEE 
PMNL 


3522 


A 


9 


602 


KMAALGEPVRLERDICRAIELLEKLQRSGEVPPQ 
KLQALQRVLQSEFCNAVREVYEHVYETVDISSSP 
EVRANATAKATVAAFAASEGHSHPRWELPKTE 
EGLGFNIMGGKEQNSPIYISRIIP/GGIADRHGGLK 
RGDQLLSVNGVSVEGEHHEKAVELLKAAQGKV 
KLVVRYTPKVLEEMESRFEKMRSAKRRQQT 


3523 


A 


645 


1465 


IMAETSLLEAGASAASTAAALENLQVEASCSVCL 

EYLKEPVIIECGHNFCKACITRWWEDLERDFPCP 

VCRKTSRYRSLRPNRQLGSMVEIAKQL\RPSSGRS 

GMRASAPQHHEALSLFCYEDQEAVCLICAISHTH 

RAHTWPLDDATQEYKEKLQKCLEAVLNQKLQEI 

TRCKSSEEKKPGELKRLVESRRQQILREFEELHRR 

LDEEQQVLLSRLEEEEQDILQRLRENAAHLGDKR 

RDLAHLAAEVEGKCLQSGFEMLKVRPLPLHSPS 

G 


3524 


A 


3 


698 


PMVRHEAGEALGAIGDPEVLEBLKQYSSDPVIEV 

AETCQLAVRRLEWLQQHGGEPAAGPYLSVDPAP 

PAEERNDVGRLREALLDESRPLFERYRAMFALRN 

AGGEEAALALAEGLHCGSALFRHEVGYVLGQLQ 

HEAAVPQLAAALARCTENPMVRHECAEALGAIA 

RPACLAALQAHADDPERWRE\SCKVALDMYEH 

ETGRAFQYADGLEQLRGAPSLGPNPHPELPEDS 


3525 


A 


1452 


694 


EGLQRPEYLVASAAGFQGLAWGGEGRGRAGCS 

SSGFRDAEPLLLSCPGRNEPLKKERLKWKSDYP 

MTDGQLRSKRDEFWDTAPAFEGRKEIWDALKA 

AAYAAEANDHELAQAILDGASITLPHGTLCECY 

DELGNRYQLPIYCLSPPVNLLLEHTEEESLEPPEP 

PPSVRREFPLKVRLSTGKDVRLSASLPDTVGQLK 

RQLHAQE/GTPKPSWQRWFFSGKLLTDRTRLQET 

KIQKDFVIQVIINQPPPPQD 


3526 


A 


123 


3441 


PGNEGLGLAADHNEDLGHLSADAPWPAVTMAP 

RKRSHHGLGFLCCFGGSDffEINLRDNHPLQFME 

FSSPIPNAEELNIRFAELVDELDLTDKNREAMFAL 

PPEKKWQIYCSKKKEQEDPNKLATSWPDYYIDRI 

NSMAAMQSLYAFDEEETEMRNQWEDLKTALR 

TQPMRFVTRFIELEGLTCLLNFLRSMDHATCESRI 

HTSLIGCIIALMNNSQGRAHVLAQPEAISTIAQSL 

IRTENSKTKVAVLEILGAVCLVPGGHKKVLQAML 

HYQVYAAERTRFQTLLNELDRSLGRYRDEVNLK 

TAIMSFINAVLNAGAGEDNLEFRLHLRYEFLMLG 

IQPVIDKLRQHENAILDKHLDFFEMVRNEDDLEL 

ARRFDMVHIDTKSASQMFELIHKKLKYTEAYPC 

LLSVLHHCLQMPYKRNGGYFQQWQLLDRILQQI 

VLQDERGVDPDLAPLENFNVKNIVNMLINENEV 

KQWRDQAEKFRKEHMELVSRLERKERECETKTL 

EKEEMlyiRT\LNKMKDKLARESQELRQARGQVA 

ELVAQLSELSTGPVSSPPPPGGPLTLSSSMTTNDL 

PPPPPPLPFACCPPPPPPPLPPGGPPTPPGAPPCLG 

MGLPLPQDPYPSSDVPLRKKRVPQPSHPLKSFNW 

VKLNEERVPGTVWNEIDDMQVFRILDLEDFEKM 

FSAYQRHQELITNPSQQKELGSTEDIYLASRKVK 

ELSVIDGRRAQNCIILLSKLKLSNEEIRQAILKMD

EQEDLAKDMLEQLLKFIPEKSDIDLLEEHKHEIER 

MARADRFLYEMSRIDHYQQRLQALFFKKKFQER 

LAEAKPKVEAILLASRELVRSKRLRQMLEVILAI 

GNFMNKGQRGGAYGFRVASLNKIADTKSSIDRN 

ISLLHYLIME-EKHFPDILNMPSELQHLPEAAKVN 

LAELEKEVGNLRRGLRAVEVELEYQRRQVREPS 

DKFVPVMSDFITVSSFSFSELEDQLNEARDKFAK 

ALMHFGEHDSKMQPDEFFGIFDTFLQAFSEARQD 

LEAMRRRKEEEERRARMEAMLKEQRERERWQR 

QRKVLAAGSSLEEGGEFDDLVSALRSGEVFDKD 

LCKLKRSRKRSGSQALEVTRERAINRLNY 


3527 


A 


1445 


714 


LLGTRMLAGQLEARDPKEGTHPEDPCPGAGAV 

MEKTAVAAEVLTEDCNTGEMPPLQQQnRLHQE 

LGRQKSLWADVHGKLRSHIDALREQNMELREKL 

RALQLQRWKARKKSAASPHAGQESHTLALEPAF 

GKISPLSADEETIPKYAGHKN\QSGHSSWGQRSSS 

NNSAPPKPMSLKIERISSWKTPPQENRDKNLSRR 

RQDRRATPTGRPTPCAERRGWSEDGKVASDTCV 

TLHWPLGKFRFR 


3528 


A 


484 


1777 


RISKIQVYYSTGYSSRKMNPTLGLAIFLAVLLTVK 

GLLKPSFSPRNYKALSEVQGWKQRMAAKELAR 

QNMDLGFKLLKKLAFYNPGRNTIFLSPLSISTAFS 

MLCLGAQDSTLDEIKQGFNFRKMPEKDLHEGFH 

YIIHELTQKTQDLKLSIGNTLFIDQRLQPQRKFLE 

DAKNFYSAETILTNFQNLEMAQKQINDFI/ESKTH 

GKINNLffiNroPGTVMLLA>mFFRARWKHEFDP 

NVTKEEDFFLEKNSSVKVPMMFRSGIYQVGYDD 

KLSCTILEIPYQKNITAIFILPDEGKLKHLEKGLQV 

DTFSRWKTLLSRRVVDVSVPRLHMTGTFDLKKT 

LSYIGVSKIFEEHGDLTKIAPHRSLKVGEAVNKA 

ELKMDERGTEGAAGTGAQTLPMETPLVVKIDKP 

YLLLIYSEKIPSVLFLGKIVNPIGK 


3529 


A 


1 


5684 


VSSVSHENPTEVFEDGENPPSSRSSESGFTEFIQY 

QADRTDDIDRELSEGQGAAAIPIGSTSSETETAST 

VGSEETUQTPSVVTQGTATRSRKTAQKTAMQCC 

LEYVQQFLTRLINLYHQNNSFSQSLATEHQGDLG 

REQGETSKWDRNSQGDVKEKNISKQKTSKEYLS 

AFLAACQLFLECSSFPVYIAEGNHTSELRSEKLET 

DCEHVQPPQWLQTLMNACSQASDFSVQSVAISL 

VMDLVGLTQSVAMVTGENINSVEPAQPLSPNQG 

RVAVVIRPPLTQGNLRYIAEKTEFFKHVALTLWD 

QLGDGTPQHHQKSVELFYQLHNLVPSSSICEDVI 

SQQLTHKDKIORMEAHAKJ^AVLWHLTRDLHINK 

SSSFVRSFDRSLFIMLDSLNSLDGSTSSVGQAWL 

NQVLQRHDIARVLEPLLLLLLHPKTQRVSVQRV 

QAERYWNKSPCYPGEESDKHFMQNFACSNVSQ 

VQLITSKGNGEKPLTMDEIENFSLTVNPLSDRLSL 

LSTSSETffMVVSDFDLPDQQffilLQSSDSGCSQSS 

AGDNLSYEVDPETVNAQEDSQMPKESSPDDDVQ 

QVVFDLICKVVSGLEVESASVTSQLEFFIAMPPKC 
SDIDPDEETIKIEDDSIQQSQNALLSNESSQFLSVS 

AEGGHECVANGISRNSSSPCISGTTHTLHDSSVAS 
IETKSRQRSHSSIQFSFKEKLSEKVSEKETTVKESG 
KQPGAKPKVKLARKKDDDKKKSSNEKLKQTSV 
FFSDGLDLENWYSCGEGDISEIESDMGSPGSRKSP
NFNIHPLYQHVLLYLQLYDSSRTLYAFSAIKAILK 
TWIAFVNAISTTSVNNAYTPQLSLLQNLLARHRI 
SVMGKDFYSHIPVDSNHNFRSSMYffilLISLCLYY 
MRSHYPTHVKVTAQDLIGNimMQMMSIEILTLL 
FTELAKVffiSSAKGFPSFISDMLSKCKVQKVILHC 
LLSSIFSAQKWHSEKMAGKNLVAVEEGFSEDSLI 
NFSEDEFDNGSTLQSQLLKVLQRLIV\LEHRVM\T 
IPEE\NETGFDFVVS\DLEHISPHQPMTSLQYLHAQ 
SITCQGMFLCAVIRAVLHQHCACKMHPQWIGLIT 
STLPYMGKVLQRWVSVTLQLCRNLDNLIQQYK 
YETGLSDSRPLWMASIIPPDMILTLLEGITAIIHYC 
LLDPTTQYHQLLVSVDQKHLFEARSGILSILHMI 
MSSVTLLWSILHQADSSEKMTIAASASLTTINLG 
ATKNLRQQILELLGPISMNHGVHFMAAIAFVWN 
ERRQNKTTTRTKVIPAASEEQLLLVELVRSISVM 
RAETVIQWKEVLKQPPAIAKDKKHLSLEVCML 
QFFYAYIQRIPVPNLVDSWASLLILLKDSIQLSLP 
APGQFLILGVLNEFIMKNPSLENKKDQRDLQDVT 
HKIVDAIGAIAGSSLEQTTWLRRNLEVKPSPKIM 
VDGTNLESDVEDMLSPAMETANITPSVYSVHAL 
TLLSEVLAHLLDMVFYSDEKERVIPLLVNIMHYV 
VPYLKKHSAHNAPSYRACVQLLSSLSGYQYTRR 
AWKKEAFDLFMDPSFFQMDASCVNHWRAIMDN 
LMTHDKTTFRDLMTRVAVAQSSSLNLFANRDVE 
LEQRAMLLKRLAFAIFSSEIDQYQKYLPDIQERLV 
ESLRLPQVPTLHSQVFLFFRVLLLRMSPQHLTSL 
WPTMITELVQVFLLMEQELTADEDISRTSGPSVA 

GLETTYTGGNGFSTSYNSQRWLNLYLSACKFLD 
LALALPSENLPQFQMYRWAFIPEASDDSGLEVRR 

QGfflQREFKPYVVRLAKLLRKRAKKNPEEDNSG 

RTLGWEPGHLLLTICTVRSMEQLLPFFNVLSQVF 

NSKVTSRCGGHSGSPILYSNAFPNKDMKLENHKP 

CSSKARQKIEEMVEKDFLEGMIKT 


3530 


A 


1 


5684 


VSSVSHENPTEVFEDGENPPSSRSSESGFTEFIQY 

QADRTDDIDRELSEGQGAAAIPIGSTSSETETAST 

VGSEETIIQTPSVVTQGTATRSRKTAQKTAMQCC 

LEYVQQFLTRLINLYIIQNNSFSQSLATEHQGDLG 

REQGETSKWDRNSQGDVKEKNISKQKTSKEYLS 

AFLAACQLFLECSSFPVYIAEGNHTSELRSEKLET 

DCEHVQPPQWLQTLMNACSQASDFSVQSVAISL 

VMDLVGLTQSVAMVTGENINSVEPAQPLSPNQG 

RVAVVIRPPLTQGNLRYIAEKTEFFKHVALTLWD 

QLGDGTPQHHQKSVELFYQLHNLVPSSSICEDVI 

SQQLTHKDKKIRMEAHAKFAVLWHLTRDLHINK 

SSSFVRSFDRSLFIMLDSLNSLDGSTSSVGQAWL 

NQVLQRHDIARVLEPLLLLLLHPKTQRVSVQRV 

QAERYWNKSPCYPGEESDKHFMQNFACSNVSQ 

VQLITSKGNGEKPLTMDEIENFSLTVNPLSDRLSL 

LSTSSETIPMVVSDFDLPDQQIEILQSSDSGCSQSS 

AGDHN.SYEVDPETVNAQEDSQMPKESSPDDDVQ 

QVVFDLICKVVSGLEVESASVTSQLEFFIAMPPKC 

SDIDPDEETIKIEDDSIQQSQNALLSNESSQFLSVS 
AEGGHECVANGISRNSSSPCISGTTHTLHDSSVAS 
IETKSRQRSHSSIQFSFKEKLSEKVSEKETIVKESG 
KQPGAKPKVKLARKKDDDKKKSSNEKLKQTSV 
SEQ ID NO:

Method

Predicted beginning nucleotide location corresponding to first amino acid residue of peptide sequence

Predicted end nucleotide location corresponding to last amino acid residue of peptide sequence

Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid,
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine,
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine,
X=Unknown, *=Stop codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)
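
The column legend above can be read mechanically. The following is a minimal Python sketch, not part of the original listing, that tallies the symbols in a peptide string according to that legend and derives a rough length from the two nucleotide-location columns. The function names `summarize` and `coordinate_span` are hypothetical, and treating a beginning location larger than the end location as a reverse-orientation reading is an assumption rather than something the table states.

```python
from collections import Counter

# One-letter codes exactly as given in the table legend.
RESIDUES = {
    "A": "Alanine", "C": "Cysteine", "D": "Aspartic Acid",
    "E": "Glutamic Acid", "F": "Phenylalanine", "G": "Glycine",
    "H": "Histidine", "I": "Isoleucine", "K": "Lysine", "L": "Leucine",
    "M": "Methionine", "N": "Asparagine", "P": "Proline",
    "Q": "Glutamine", "R": "Arginine", "S": "Serine", "T": "Threonine",
    "V": "Valine", "W": "Tryptophan", "Y": "Tyrosine", "X": "Unknown",
}

# Non-residue symbols from the legend.
MARKERS = {"*": "stop", "/": "deletion", "\\": "insertion"}


def summarize(seq):
    """Count residues, legend markers, and unrecognized characters."""
    counts = Counter()
    for ch in seq:
        if ch.isspace():
            continue  # line breaks in the table are layout, not sequence
        if ch in RESIDUES:
            counts["residue"] += 1
        elif ch in MARKERS:
            counts[MARKERS[ch]] += 1
        else:
            counts["unrecognized"] += 1  # e.g. lowercase scanning damage
    return dict(counts)


def coordinate_span(begin, end):
    """Return (nucleotides covered, approximate codons, orientation).

    Orientation is an assumed interpretation: begin > end is taken to
    mean the peptide is read in the reverse direction.
    """
    length = abs(end - begin) + 1
    orientation = "forward" if begin <= end else "reverse"
    return length, length // 3, orientation
```

For example, `summarize(r"RLIV\LEHRVM\T")` reports eleven residues and two possible insertion sites, and `coordinate_span(3, 273)` gives a span of 271 nucleotides, or about 90 codons. Because deletion and insertion markers indicate possible frame shifts, the codon estimate is only approximate.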










FFSDGLDLENWYSCGEGDISEIESDMGSPGSRKSP 

NFNIHPLYQHVLLYLQLYDSSRTLYAFSAIKAILK 

TM>IAFVNAISTTSVNNAYTPQLSLLQNLLARHRI 

SVMGKDFYSHPVDSNHNFRSSMYIEILISLCLYY 

MRSHYPTHVKVTAQDLIGNRNMQMMSIEILTLL 

FTELAKVIESSAKGFPSnSDMLSKCKVQKVILHC 

LLSSIFSAQKWHSEKMAGKNLVAVEEGFSEDSLI 

NFSEDEFDNGSTLQSQLLKVLQRLIV\LEHRVM\T 

IPEE\NETGFDFWS\DLEHISPHQPMTSLQYLHAQ 

SITCQGMFLCAVIRA\LHQHCACKMHPQWIGLIT 

STLPYMGKVLQRVWSVTLQLCRNLDNLIQQYK 

YETGLSDSRPLWMASIIPPDMILTLLEGITAIIHYC 

LLDPTTQYHQLLVSVDQKHLFEARSGILSILHMI 

MSSVTLLWSILHQADSSEKMTIAASASLTTINLG 

ATKNLRQQILELLGPISMNHGVHFMAAIAFVWN 

ERRQNKTTTRTKVIPAASEEQLLLVELVRSISVM 

RAETVIQTVKEVLKQPPAIAKDKKHLSLEVCML 

QFFYAYIQRIPVPNLVDSWASLLILLKDSIQLSLP 

APGQFLILGVLNEFIMKNPSLENKKDQRDLQDVT 

HKIVDAIGAIAGSSLEQTTWLRRNLEVKPSPKIM 

VDGTNLESDVEDMLSPAMETANITPSVYSVHAL 

TLLSEVLAHLLDMVFYSDEBCERVIPLLVNIMHYV 

VPYLRNHSAHNAPSYRACVQLLSSLSGYQYTRR 

AWKKEAFDLFMDPSFFQMDASCVNHWRAIMDN 

LMTHDKTTFRDLMTRVAVAQSSSLNLFANRDVE 

LEQRAMLLKRLAFAIFSSEIDQYQKYLPDIQERLV 

ESLRLPQVPTLHSQVFLFFRVLLLRMSPQHLTSL 

WPTMITELVQVFLLMEQELTADEDISRTSGPSVA 

GLETTYTGGNGFSTSYNSQRWLNLYLSACKFLD 

LALALPSENLPOFOMYRWAFIPEASDDSGLEVRR 

QGIHQREFKPYVVRLAKLLRKRAKKNPEEDNSG 

RTLGWEPGHLLLTICTVRSMEQLLPFFNVLSQVF 

NSKVTSRCGGHSGSPILYSNAFPNKDMKLENHKP 

CSSKARQKIEEMVEKDFLEGMIKT 


3531 


A 


553 


2470 


LISPSPALSSQDPALSLKENLEDISGWGLPEARSK 

ESVSFKDVAVDFTQEEWGQLDSPQRALYRDVM 

LENYQNLLALGPPLHKPDVISHLERGEEPWSMQ 

REVPRGPCPEWELKAVPSQQQGICKEEPAQEPIM 

ERPLGGAQAWGRQAGALQRSQAAP\GR\RTCHG 

LGRPWEEFPLRCPLFAQQRVPEGGPLLDTRKNV 

QATEGRTKAPARLCAGENASTPSEPEKFPQVRRQ 

RGAGAGEGEFVCGECGKAFRQSSSLTLHRRWHS 

REKAYKCDECGKAFTWSTNLLEHRRIHTGEKPFF 

CGECGKAFSCHSSLNVHQRIHTGERPYKCSACEK 

AFSCSSLLSMHLRVHTGEKPYRCGECGKAFNQR 

THLTRHHRIHTGEKPYQCGSCGKAFTCHSSLTVH 

EKIHSGDKPFKCSDCEKAFNSRSRLTLHQRTHTG 

EKPFKCADCGKGFSCHAYLLVHRRIHSGEKPFKC 

NECGKAFSSHAYLIVHRRIHTGEKPFDCSOCWKA 

FSCHSSLP/HQRIHTGEKPYKCSECGRAFSQNHCL 

nCHQKIHSGEKSFKCEKCGEMFNWSSHLTEHQRL 

HSEGKPLAIQFNKHLLSTYYVPGSLLGAGDAGLR 

DVDPBDALDVAKLLCVVPPRAGRNFSLGSKPRN 


3532 


A 


3931 


317 


HRELQDSPSAEPPAGSMPLRHWGMARGSKPVGD 
GAQPMAAMGGLKVLLHWAGPGGGEPWVTFSES 










SLTAEEVCIHIAHKVGITPPCFNLFALFDAQAQV

MPPNHILEIPRDASLMLYF\RHRFYSR\NWHGM

NPREPAVYRCGPPGTEASSDQTAQGMQLLDPAS

FEYLFEQGKHEFVNDVASLWELSTEEEIHHFKNE

SLGMAFLHLCHLALRHGIPLEEVAKKTSFKDCIP

RSFRRHIRQHSALTRLRLRNVFRRFLRDFQPGRLS

QQMVMVKYLATLERLAPRFGTERVPVCHLRLLA

QAEGEPCYIRDSGVAPTDPGPESAAGPPTHEVLV

TGTGGIQWWPVEEEVNKEEGSSGSSGRNPQASL

FGKKAKAHKAFGQPADRPREPLGAYFCDFRDIT

HVGLKEHCVSIHRQDNKCLELSLPSRAAALSFVS

LVDGYFRLTADSSHYLCHEVAPPRLVMSIRDGIH

GPLLEPFVQAKLRPEDGLYLIHWSTSHPYRLILTV

AQRSQAPDGMQSLRLRKFPIEQQDGAFVLEGWG

RSFPSVRELGAALQGCLLRAGDDCFSLRRCCLPQ

PGETSNLUMRGARASPRTLNLSQLSFHRVDQKEI

TQLSHLGQGTRTNVYEGRLRVEGSGDPEEGKMD

DEDPLVPGRDRGQELRVVLKVLDPSHHDIALAF

YETASLMSQVSHTHLAFVHGVCVRGPENIMYTE

YVEHGPLDVWLRRERGHVPMAWKMVVAQQLA

SALSYLENKNLVHGNVCGRNILLARLGLAEGTSP

FIKLSDPGVGLGALSREERVERJPWLAPECLPGG

ANSLSTAMDKWGFGATLLEICFDGEAPLQSRSPS

EKI:HFYQRQHRLPEPSCPQLATLTSQCLTYEPTQ

RPSFRTE.KDLTRLQPHNLADVLTVNPDSPASDPT

VFHKRYLKKIRDLGEGHFGKVSLYCYDPTNDGT

GEMVAVKALKADCGPQHRSGWKQEIDILRTLYH

EHIIKYKGCCEDQGEKSLQLVMEYVPLGSLRDYL

PRHSIGLAQLLLFAQQICEGMAYLHAQHYIHRDL

AARNVLLDNDRLVKIGDFGLAKAVPEGHEYYRV

REDGDSPVFWYAPECLKEYKFYYASDVWSFGVT

LYELLTHCDSSQSPPTKFLELIGIAQGQMTVLRLT

ELLERGERLPRPDKCPCEVYHLMKNCWETEASF

RPTFENLIPILKTVHEKYQGQAPSVFSVC


3533 


A 


182 


3465 


FRWLDFFRGSINSQFEFGRKKENMTSPAKFKKDK

EIIAEYDTQVKEIRAQLTEQMKCLDQQCELRVQL

LQDLQDFFRKKAEIEMDYSRNLEKLAERFLAKT

RSTKDQQFKKDQNVLSPVNCWNLLLNQVKRES

RDHTTLSDIYLNNIIPRFVQVSEDSGRLFKKSKEV

GQQLQDDLMKVLNELYSVMKTYHMYNADSISA

QSKLKEAEKQEEKQIGKSVKQEDRQTPRSPDSTA

NVRMEKHVRRSSVKKIEKMKEKRQAKYTENKL

KAIKARNEYLLALEATNASVFKYYIHDLSDLIDQ

CCDLGYHASLNRALRTFLSAELNLEQSKHEGLD

AIENAVENLDATSDKQRLMEMYNNVFCPPMKFE

FQPHMGDMASQLCAQQPVQSELLQRCLQLQSRL

STLKIENEEVKKTMEATLQTIQDIVTVEDFDVSD

CFQYSNSMESVKSTVSETFMSKPSIAKRRANQQE 

TEQFYFTKMKEYLEGRNLITKLQAKHDLLQKTL 

GESQRTDCSLARRSSTVRKQDSSQAIPLWESCIR 

FISRHGLQHEGIFRVSGSQVEVNDIKNAFERGEDP 

LAGDQNDHDMDSIAGVLKLYFRGLEHPLFPKDIF 

HDLMACVTMDNLQERALHIRKVLLVLPKTTLn 

MRYLFAFLNHLSQFSEENMMDPYNLAICFGPSL 

MSVPEGHDQVSCQAHVNELIKTniQHENlFPSPRE 










LEGPVYSRGGSMEDYCDSPHGETTSVEDSTQDV 

TAEHHTSDDECEPIEAIAKFDYVGRTARELSFKK 

GASLLLYQRASDDWWEGRHNGIDGLIPHQYIW 

QDTEDGVVERSSPKSEIEVISEPPEEKVTARAGAS 

CPSGGHVADIY1J^.NINKQRKRPESGSIRKTFRSDS 

HGLSSSLTDSSSPGVGASCRPSSQPIMSQSLPKEG 

PDKCSISGHGSLNSISRHSSLKNRLDSPQIRKTAT 

AGRSKSFDNHRPMDPEVIAQDIEATMNSALNELR 

ELERQSSVKHTPDWLDTLEPLKTSPVVAPTSEPS 

SPLHTQLLKDPEPAFQRSASTAGDIACAFRPVKS 

VKMAAPVKPPAT\RPKPT\VFPKTNATSPGVNSST 

SPQSTDKSCTV 


3534 


A 


1 


2640 


FRRFVCPASRRPAAGLRDAASSAPRGMASEGPRE 

PESEGIKLSADVKPFVPRFAGLNVAWLESSEACV 

FPSSAATYYPFVQEPPVTEQKIYTEDMAFGASTFP 

PQYLSSEITLHPYAYSPYTLDSTQNVYSVPGSQY 

LYNQPSCYRGFQTVKHRNENTCPLPQEMKALFK 

KKTYDEKKTYDQQKFDSERADGTISSEIKSARGS 

HHLSIYAENSLKSDGYHKRTDRKSRIIAKNVSTS 

KPEFEFTTLDFPELQGAENNMSEIQKQPKWGPVH 

SVSTDISLLREVVKPAAVLSKGEIWKNNPNESV 

TANAATNSPSCTRELSWTPMGYVVRQTLSTELS 

AAPKNVTSMINLKTIASSADPKNVSIPSSEALSSD 

PSYNKEKHIIHPTQKSKASQGSDLEQNEASRKNK 

KKKEKSTSKYEVLTVQEPPRIEDAEEFPNLAYAS 

ERRDRIETPKFQSKQQPQDNFKNNVKKSQLPVQL 

DLGGMLTALEKKQHSQHAKQSSKPVWSVGAV 

PVLSKECASGERGRRMSQMKTPHNPLDSSAPLM 

KKGKQREIPKAKKPTSLKKIILKERQERKQRLQE 

NAVSPAFTSDDTQDGESGGDDQFPEQAELSGPEG 

MDELISTPSVEDKSEEPPGTELQRDTEASHLAPN 

HTTFPKIHSRRFRDYCSQMLSKEVDACVTDLLKE 

LVRFQDRMYQKDPVKAKTKRRLVLGLREVLKH 

LKLKKLKCVIISPNCEKIQSKGGLDDTLHTIIDYA 

CEQNIPFVFALNRKALGRSLNKAVPVSVVGIFSY 

DGAQDQFHKMVELTVAARQAYKTMLENVQQE 

LVGEP\SLRHLPAYPHRAPAALQKMAPQP/VKEK 

EEPHYIEIWKKHLEAYSGCTLELEESLEASTSQM 

MNLNL 


3535 


A 


1747 


983 


LFQFQVCRSVLSPRAAGCTWSLAPRSRGAAGSPR 

RYRGPQPQPAPPSALPNSPa>SPVASGREMVVLSV 

PAEVTVILLDIEGTTTPIAFVKDILFPYIEENVKEY 

LQTHWEEEECQQDVSLLRKQVXFADVVPAVRKW 

REAGMKVYIYSSGSVEAQKLLFGHSTEGDILELV 

DGHFDTKIGHKVESESYRKIADSIGCSTNNILFLT 

DVTREASAAEEADVHVAVWRPGNAGLTDDEK 

TYYSLITSFSELYLPSST 


3536


A 


3 


1302 


GRPPTAPHTGRPPTANRGDPRLDLKRGCARLLTS 

lESRGRPAASAGLRRDRCALRRWPLRRAPLARAT 

RRRAGSPRRCAPRPRACPQGWSRARHQPGGLCL 

LLLLLCQFMEDRSAQAGNCWLRQAKNGRCQVL 

YKTELSKEECCSTGRLSTSWTEEDVNDNTLFKW 

MIFNGGAPNCIPCKETCEl^mDCGPGKKCRMNK^ 

NKPRCVCAPDCSNITWKGPVCGLDGKTYRNECA 

LLKARCKEQPELEVQYQGRCKKTCRDVFCPGSS 










TCVWDQTNNAYCVTCNRICPEPASSEQYLCGND 
GVTYS\SACHLRKATCLLGRSIGLAYEGKCIKAK 
SCEDIQCTGGKKCLWDFKVGRGRCSLCDELCPD 
SKSDEPVCASDNATYASECAMKEAACSSGVLLE 
VKHSGSCNSISEDTEEEEEDEDQDYSFPISSILEW 


3537 


A 


285 


2123 


IGLFLQVAPLSVMAKSCPSVCRCDAGFIYCNDRF 

LTSIPTGIPEDATTLYLQNNQINNAGIPSDLKNLL 

KVERIYLYHNSLDEFPTNLPKYVKELHLQENNIR 

TITYDSLSKIPYLEELHLDDNSVSAVSIEEGAFRD 

SNYLRLLFLSRNHLSTIPWGLPRTIEELRLDDNRIS 

TISSPSLQGLTSLKRLVLDGNLLNNHGLGDKVFF 

NLVNLTELSLVRNSLTAAPVNLPGTNLRKLYLQ 

DNHINRVPPNAFSYLRQLYRLDMSNNNLSNLPQ 

GIFDDLDNITQLILRNNPWYCGCKMKWVRDWL 

QSLPVKVNVRGLMCQAPEKVRGMAIKDLNAELF 

DCKDSGIVSTIQITTAIPNTVYPAQGQWPAPVTK 

QPDIKNPKLTKDHQTTGSPSRKTITITVKSVTSDTI 

HISWKLALPMTALRLSWLKLGHSPAFGSITEHVT 

GERSEYLVTALEPDSPYKVCMVPMETSNLYLFD 

ETPVCIETETAPLRMYNPTTTLNREQEKEPYKNP 

NLPLAAIIGGAVALVTIALLALVCWYVHRNGSLF 

SRNCAYSKGRPOIKDDYAEAGTKKDNSILEIRETS 

FQMLPISNEPISKEEFVIHTIFPPNGMNLYKNNH


3538 


A 


877 


6184 


WNVKPSLLVVQLFKFSDKEEHEQNDSISGKTGET 

GVEEMIATRKVEQDSKETVKLSHEDDHILEDAGS 

SDISSDAACTNPNKTENSLVGLPSCVDEVTECNL 

ELKDTMGIADKTENTLERNKIEPLGYCEDAESNR 

QLESTEFNKSNLEVVDTSTFGPESNILENAICDVP 

DQNSKQLNAIESTKIESHETANLQDDRNSQSSSV 

SYLESKSVKSKHTKPVIHSKQ>MTTDAPKKIVAA 

KYEVIHSKTKVNVKSVKRNTDVPESQQNFHRPV 

KVRKKQIDKEPKIQSCNSGVKSVKNQAHSVLKK 

TLQDQTLVQIFKPLTHSLSDKSHAHPGCLKEPHH 

PAQTGHVSHSSQKQCHKPQQQAPAMKTNSHVK 

EELEHPGVEHFKEEDKLKLKKPEKNLQPRQRRSS 

KSFSLDEPPLFIPDNIATIRREGSDHSSSFESKYMW 

TPSKQCGFCKKPHGNRFMVGCGRCDDWFHGDC 

VGLSLSQAQQMGEEDKEYVCVKCCAEEDKKTEI 

LDPDTLENQATVEFHSGDKTMECEKLGLSKHTT 

NDRTKYIDDTVKHKVKILKRESGEGRNSSDCRD 

NEIKKWQLAPLRKMGQPVLPRRSSEEKSEKIPKE 

STTVTCTGEKASKPGTHEKQEMKKKKVNEKGVL 

NVHPAASASKPSADQIRQSVRHSLKDILMKRLTD 

SNLKVPEEKAAKVATKIEKELFSFFRDTDAKYKN 

KYRSLMFNLKDPKNNILFKKVLKGEVTPDHLIR 

MSPEELASKELAAWRRRENRHTDEMIEKEQREVE 

RRPITKITHKGEIEIESDAPMKEQEAAMEIQEPAA 

NKSLEKPEGSEK\RKEEVDSMSKDTTSQHRQHLF 

DLNCKICIGRMAPPVDDLSPKKVKVWGVARKH 

SDNEAESIADALSSTSNILASEFFEEEKQESPKSTF 

SPAPRPEMPGTVEVESTFLARLNFIWKGFINMPS 

VAKFVTKAYPVSGSPEYLTEDLPDSIQVGGRISPQ 

TVWDYVEKIKASGTKEICVVRFTPVTEEDQISYT 

LLFAWSSRKRYGVAANNMKQVKDMYLEPLGAT 

DKIPHPLVPFDGPGLELHRPNLLLGLURQKLKRQ 










HSACASTSHIAETPESAPPIALPPDKKSKIEVSTEE 

APEEENDFFNSFTTVLHKQRJflCPQQNLQEDLPTA 

\nSPLMEVTKQEPPKPLRFLPGVLIGWENQPTTLE 

LANKPLPVDD1LQSLLGTTGQVYDQ\AQSVMEQ 

NTVKEIPFLNEQTNSKIEKTDNVEVTDGENKEIK 

VKVDNISESTDKSAEIETSVVGSSSISAGSLTSLSL 

RGKPPDVSTEAFLTNLSIQSKQEETVESKEKTLKR 

QLQEDQENNLQDNQTSNSSPCRSNVGKGNIDGN 

VSCSENLVANTARSPQFINLKRDPRQAAGRSQPV 

TTSESKDGDSCRNGEKHMLPGLSHNKEHLTEQIN 

VEEKLCSAEKNSCVQQSDNLKVAQNSPSVENIQT 

SQAEQAKPLQEDILMQNIETVHPFRRGSAVATSH 

FEVGNTCPSEFPSKSITFTSRSTSPRTSTNFSPMRP 

QQPNLQHLKSSPPGFPFPGPPNFPPQSMFGFPPHL 

PPPLLPPPGFG\FA\QNPMVPWPPW\HLP\GQPQR 

MMGPLSQASRYIGPQNFYQVKDIRRPERRHSDP 

WGRQDQQQLDRPFNRGKGDRQRFYSDSHHLKR 

ERHEKEWEQESERHRRRDRSQDKDRDRKSREEG 

HKDKERARLSHGDRGTDGKASRDSRNVDKKPD 

KPKSEDYEKDKEREKSKHREGEKDRDRYHKDR 

DHTDRTKSKR 


3539 


A 


157 


1769 


GSWTVELSLKPSASPSLKWVCLPGAAAVNKHRS 

GAGGLIRSLIQCTWAPAGPARRGGRGIEDFPYLF 

FQLTHCQQRICSVTQAGVQWCDHSSLQPQTPGL 

NQSSHLSLLSSRDYRMLSSFNEWFWQDRFWLPP 

NVTWTELEDRDGRVYPHPQDLLAALPLALVLLA 

MRLAFERFIGLPLSRWLGVRDQTRRQVKPNATL 

EKHFLTEGHRPKEPQLSLLAAQCGLTLQQTQRW 

FRRRRNQDRPQLTKKFCEASWRFLFYLSSFVGGL 

SVLYHESWLWAPVMCWDRYPNQLTLSCPAADS 

EAXSLYWWYLLELGFYLSLLIRLPFDVKRKGGGP 

SSIKPRPHYDPPSTA\DFKEQVIHHFVAVILMTFSY 

SANLLRIGSLVLLLHDSSDYLLEACKMVNYMQY 

QQVCDALFLIFSFVFFYTRLVLFPTQILYTTYYESI 

SNRGPFFGYYFFNGLLMLLQLLHVFWSCLILRML 

YSFMKKGQMEKDIRSDVEESDSSEEAAAAQEPL 

QLKNGTAGGPRPAPTDGPRSRVAGRLTNRHTTA 

T 


3540 


A 


267 


1397 


SPAGYCHSGLLPGCSRSA/CADLAKHQELPGKKL 

LSEKKLKRYFVDYRRVLVCGGNGGAGASCFHSE 

PRKEFGGPDGGDGGNGGHVILRVDQQVKSLSSV 

LSRYQGFSGEDGGSKNCFGRSGAVLYIRVPVGTL 

VKEGGRVVADLSCVGDEYIAALGGAGGKGNRF 

FLANNNRAPVTCTPGQPGQQRVLHLELKTVAHA 

GMVGFPNAGKSSLLRAISNARPAVASYPFTTLKP 

HVGIVHYEGHLQIAVADIPGIIRGAHQNRGLGSA 

FLRHIERCRFLLFVVDLSQPEPWTQVDDLKYELE 

MYEKGLSARPHAIVANKIDLPEAQANLSQLRDH 

LGQEVIVLSALTGENLEQLLLHLKVLYDAYAEA 

ELGQGRQPLRW 


3541 


A 


1 


8008 


DTQVSETLKRFAGKVTTASVKERREILSELGKCV 

AGKDLPEGAVKGLCKLFCLTLHRYRDAASRRAL 

QAAIQQLAEAQPEATAKNLLHSLQSSGIGSKAGV 

PSKSSGSAALLALTWTCLLVRIVFPSRAKRQGDI 

WNKLVEVQCLLLLEVLGGSHKHAVDGAVKKLT 










KLWKENPGLVEQYLSAILSLEPNQNYAGMLGLL 

VQFCTSHKEMDWSQHKSALLDFYMKNILMSK 

VKPPKYLLDSCAPLLRYLSHSEFKDLILPTIQKSL 

LRSPENVIETISSLLASVTLDLSQYAMDIVKGLAG 

HLKSNSPRLMDEAVLALRNLARQCSDSSAMESL 

TKHLFAILGGSEGKLTVVAQKMSVLSGIGSVSHH 

WSGPSSQVLNGIVAELFIPFLQQEVHEGTLVHA 

VSVLALWCNRFTMEVPKKLTEWFKKAFSLKTST 

SAVRHAYLQCMLASYRGDTLLQALDLLPLLIQT 

VEKAASQSTQVPTITEGVAAALLLLKLSVADSQA 

EAKLSSFWQLIVDEKKQVFTSEKFLVMASEDAL 

CTVLH\LTERLFLDHPHRLTGNKVQQYHRALVA 

VLLSRTWHVRRQAQQTVRKLLSSLGGFKLAHGL 

LEELKTVLSSHKVLPLEALVTDAGEVTEAGKAY 

VPPRVLQEALCVISGVPGLKGDVTDTEQLAQEM 

LIISHHPSLVAVQSGLWPALLARMKIDPEAFITRH 

LDQnPRMTTQSPLNQSSMNAMGSLSVLSPDRVL 

PQLISTITASVQNPALRLVTREEFAIMQTPAGELY 

DKSIIQSAQQDSKKANMKRENKAYSFKEQUELE 

LKEEIKKKKGIKEEVQLTSKQKEMLQAQLDREA 

QVRRRLQELDGELEAALGLLDIILAKNPSGLTQYI 

PVLVDSFLPLLKSPLAAPRIKNPFLSLAACVMPSR 

LKALGTLVSHVTLRLLKPECVLDKSWCQEELSV 

AVKRAVMLLHTHTITSRVGKGEPGAAPLSAPAFS 

LVFPFLKMVLTEMPHHSEEEEEWMAQILQILTVQ 

AQLRASPNTPPGRVDENGPELLPRVAMLRLLTW 

VIGTGSPRLQVLASDTLTTLCASSSGDDGCAFAE 

QEEVDVLLCALQSPCASVRETVLRGLMELHMVL 

PAPDTDEKNGLNLLRRLWVVKFDKEEEIRKLAE 

RLWSMMGLDLQPDLCSLLIDDVIYHEAAVRQAG 

AEALSQAVARYQRQAAEVMGRLMEIYQEKLYR 

PPPVLDALGRVISESPPDQWEARCGLALALNKLS 

QYLDSSQVKPLFQFFVPDALNDRHPDVRKCMLD 

AALATLNTHGKENVNSLLPVFEEFLKNAPNDAS 

YDAVRQSVVVLMGSLAKHLDKSDPKVKPIVAKL 

lAALSTPSQQVQESVASCLPPLVPAIKEDAGGMIQ 

RLMQQLLESDKYAERKGAAYGLAGLVKGLGILS 

LKQQEMMAALTDAIQDKKNFRRREGALFAFEM 

LCTMLGKLFEPYVVHVLPHLLLCFGDGNQYVRE 

AADDCAKAVMSNLSAHGVKLVLPSLLAALEEES 

WRTKAGSVELLGAMAYCAPKQLSSCLPNIVPKL 

TEVLTDSHVKVQKAGQQALRQIGSVIRNPEILAI 

APVLLDALTDPSRKTQKCLQTLLDTKFVHFIDAP 

SLALIMPrVQRAFQDRSTDTRKMAAQIIGNMYSL 

TDQKDLAPYLPSVTPGLKASLLDPVPEVRTVSAK 

ALGAMVKGMGESCFEDLLPWLMETLTYEQSSV 

DRSGAAQGLAEVMAGLGVEKLEKLMPEIVATAS 

KVDIAPHVRDGYIMMFNYLPITFGDKFTPYVGPII 

PCILKALADENEFVRDTALRAGQRVISMYAETAl 

ALLLPQLEQGLFDDLWRIRFSSVQLLGDLLFHISG 

VTGKMTTETASEDDNFGTAQSNKAIITALGVERR 

NRVLAGLYMGRSDTQLVVRQASLHVWKIVVSN 

TPRTLREELPTLFGLLLGFLASTCADKRTIAARTL 

GDLVRKLGEKILPEIIPILEEGLRSQKSDERQGVCI 

GLSEIMKSTSRDAVLYFSESLVPTARKALCDPLE 










EVREAAAKTFEQLHSTIGHQALEDILPFLLKQLD 

DEEVSEFALDGLKQVMAIKSRVVLPYLVPKLTTP 

PVNTRVLAFLSSVAGDALTRHLGVILPAVMLAL 

KEKLGTPDEQLEMANCQAVILSVEDDTGHRmE 

DLLEATRSPEVGMRQAAAIILNIYCSRSKADYTS 

HLRSLVSGLIRLFNDSSPVVLEESWDALNAITKK 

LDAGNQLALIEELHKEIRLIGNESKGEHVPGFCLP 

KKGVTSILPVLREGVLTGSPEQKEEAAKALGLVI 

RLTSADALRPSWSITGPLIRILGDRFSWNVKAAL 

LETLSLLLAKVGIALKPFLPQLQTTFTKALQDSNR 

GVRLKAADALGmSIHIKVDPLFTELLNGIRAME 

DPGVRDTMLQALRFVIQGAGAKVDAVIRKNIVS 

LLLSMLGHDEDNTRISSAGCLGELCAFLTEEELS 

AVLQQCLLADVSGIDWMVRHGRSLALSVAVNV 

APGRLCAGRYSSDVQEMILSSATADRIPIAVSGV 

SSDIRLVAEKMIWWANKDPLPPLDPQAIKPILKA 
LLDNTKDKNTVVRAYSDQAIVNLLKMRQGEEVF 
QSLSKILDVASLEVLNEVNRRSLKKLASQADSTE 
QVDDTILT 


3542 


A 


62 


1130 


PWNPQDFPGNRGLMG\QKGEIGPP\GQQGKKGAP 
GMP\GLMGSNGSPGQPGTPGSKGSKGEPGIQGMP 
GASGLKGEPGATGSPGEPGYMGLPGIQGKKGDK 
GNQGEKGIQGQKGENGRQGIPGQQGIQGHHGAK 
GERGEKGEPGVRGAIGSKGESGVDGLMGPAGPK 
GQPGDPGPQGPPGLDGB^PGREFSEQFIRQVCTDV 

GPEGPRGLPGLPGRDGVPGLVGVPGRPGVRGLK 
GLPGRNGEKGSQGFGYPGEQGPPGPPGPEGPPGI 
SKEGPPGDPGLPGKDGDHGKPGIQGQPGPPGICD 
PSLCFSVIARRDPFRKGPNY 




A 




194 


PARST FKMKASVVLST T.GYT VVPSGAYIT GRCTV 

AKKLHDGGLDYFERYSLENWVCLAYFESKFNPS\ 
AIYENTREGYTGFGLFQMRGSDWCGDHGRNRC 
PIMSCSALLNPNLEKTIKCAKTT/KGKEGMGAWP 
TWSRYCQYSDTLARWLDGCKL 


3544 


A 


2 


1074 


SCRLAAGRLAQWLLRASRSGMLRAGWLRGAAA 

LALLLAARWAAFEPITVGLAIGAASAITGYLSY 

NDIYCRFAECCREERPLNASALKLDLEEKLFGQH 

LATEVIVFKALTGFRNNKNPKKPLTLSLHGWAGT 

GKNFVSQMGAENLHPKGLKSNFVHLFVSTLHFP 

HEQKIKLYQDQLQKWIRGNVSACANSVFIFDEM 

DKL\HPGIIE\AIKPFLDYYEHVERVSYR\KAIFIFLS 

NAGGDLITKTALDFWRAGRKREDIQLKDLEPVL 

SVGVFNNBCHSGLWHSGLIDKNLIDYFIPFLPLEYR 

HVKMCVRAEMRARGSAIDEDIVTRVAEEMTFFPV 

RDEKIYSDKGCKTVQSRLDFH 


3545 


A 


3 


273 


SAQGRSWGRFYRQIKRHPGIIPMIGLICLGMGSA 

ALYLLRLALRSPDVW*SWDRKNNPEPWNRLSPN 

DQYKFLAVSTDYKKLKKDRPDF 


3546 


A 


23 


591 


ALSTETRTPDMRRLLLVTSLVWLLWEAGAVPA 


PKVPIKMQVKHWPSEQDPEKAWGARWEPPEK 
DDQLVVLFPVQKPKLLTTEEKPRGQGRGPILPGT 
KAWMETEDTLGRVLSPEPDHDSLYHPPPEEDQG 
EERPRLWVMPMHQVLLGPEEDQDHIYHPQ*GSR 
WO 01/57190










GHHCPRPVPRPRLLGLGPSLPCPS 


3547 


A 


23 


591 


ALSTETRTPDMRRLLLVTSLVWLLWEAGAVPA 

PKVPIKMOVKHWPSEODPFKAWGARVVFPPFK 

DDQLWLFPVQKPKLLTTEEKPRGQGRGPILPGT 

KAWMETEDTLGRVLSPEPDHDSLYHPPPEEDQG 

EERPRLWVMPNHQVLLGPEEDQDHIYHPQ*GSR 

GHHCPRPVPRPRLLGLGPSLPCPS 


3548 


A 


3 


1641 


TWLPSVPAEEVQQPEMAAVLNAERLEVSVDGLT 

LSPDPEERPGAEGAPLAAATAATALATWIRSRPG 

RLRGTARSPGRRAAGGAAEEARRLEQRWGFGLE 

ELYGLALRFFKEKDGKAFHPTYEEKLKLVALHK 

QVLMGPYNPDTCPEVGFFDVLGNDRRREWAAL 

GNMSKEDAMVEFVKLLNRCCHLFSTYVASHKIE 

KEEQEKKRKEEEERRRREEEERERLQKEEEKRRR 

EEEERLRREEEERRRIEEERLRLEQQKQQIMAAL 

NSQTAVQFQQYAAQQYPGNYEQQQILIRQLQEQ 

HYQQYMQQLYQVQLAQQQAALQKQQEVVVAG 

SSLPTSSKVECNCTQVI*CQFNRQAKTHTDSSEKE 

LEPEAAEEALENGPKESLPVIAAPSMWTRPQIKD 

FK'FKTOnn AD^VTTVn'R HFWTVR VPTHFFn^sVT 

FWEFATDNYDIGFGVYFEWTDSI^NTAVSVHVSE 
SSDDDEEEEENIGCEEKAKKNANKPLLDEIVPVY 
RRDCHEEVYAGSHQYPGRGVYLLKFDNSYSLW 
RSKSVYYRVYYTR 


3549


A 


1837 


3593 


PAVLVLEPASQSRKQQNTASATAQHWSAQIHKE 

SFLAPVFTKDEQKHRRPYEFEVERDAKARGLEQF 

SATHGHTPIILNGWHGESAMDLSCSSEGSPGATS 

PFPVSASTPKIGAISSLQGALGMDLSGILQAGLIHP 

VTGQIVNGSLRRDDAATRRRRGRRKHVEGGMD 

LIFLKEQTLQAGILEVHEDPGQATLSTTHPEGPGP 

ATSAPEPATAASSQAEKSBPSKSLLDWLRQQADY 

SLEVPGFGANFSDKPKQRRPRCKEPGKLDVSSLS 

GEERVPAIPKEPGLRGFLPENKFNHTLAEPILRDT 

GPRRRGRRPRSELLKAPSIVADSPSGMGPLFMNG 

LIAGMDLVGLQNMRNMPGIPLTGLVGFPAGFAT 

MPTGEEVKSTLSMLPMMLPGMAAVPQMFGVGG 

LLSPPMATTCTSTAPASLSSTTKSGTAVTEKTAE 

DKPSSHOVlCTnTLAEDlCPGPGPFSDOSEPATTTSS 

PVAFNPFLIPGVSPGLIYPSMFLSPGMGMALPAM 

QQARHSEIVGLESQKRKKKKTKGDNPNSHPEPA 

PSCEREPSGDENCAEPSAPLPAEREHGAOAGEGA 

LKDSNNDTN 


3550 


A 


287 


39 


QLNLNIOATSQKHRDFVAESVGEKPVGSLAGIGE 
VMDKKLEEGCFDKAYVVLGQFLVLKKDEDLF*E 
WLRDTGGARTRGSRE 


3551 


A 


21 


3925 


GDLLEVGLPPGLEFPRGICLRGLRRTMSLDFGSV 

ALPVQNEDEEYDEEDYEREKELQQLLTDLPHDM 

LDDDLSSPELQYSDCSEDGTDGQPHHPEQLEMS 

WNEQMLPKSQSVNGPSCQGLEPYNKVTYKPYQS 

SAQNNGSPAQEITGSDTFEGLQQQFLGANENSAE 

NMQIIQLQVLNKAKERQLENLIEKLNESERQIRY 

LNHQLVIIKDEKDGLTLSLRESQKLFQNGKEREIQ 

LEAQIKALETQIQALKVNEEQMIKKSRTTEMALE 

SLKQQLVDLHHSESLQRAREQHESIVMGLTKKY 

EEQVLSLQKNLDATVTALKEQEDICSRLKDHVK 










QLERNQEADCLEKTEIINKLTRSLEESQKQCAHLL 

QSGSVQEVAQLQFQLQQAQKAHAMSANMNKA 

LQEELTELKDEISLYESAAKLGIHPSDSEGELNIEL 

TESYVDLGIKKVNWKKSKVTSIVQEEDPNEELSK 

DEFILKLKAEVQRLLGSNSMKRHLVSQLQNDLK 

DCHKKBEDLHQVKKDEKSIEVETKTDTSEKPKNQ 

LWPESSTSDWRDDILLLKNEIQVLQQQNQELKE 

TEGKLRNTNQDLCNQMRQMVQDFDHDKQEAV 

DRCERTYQQHHEAMKTQIRESLLAKHALEKQQL 

FEAYERTHLQLRSELDKLNKEVTAVQECYLEVC 

REKDNLELTLRKTTEKEQQTQEKIKEKLIQQLEK 

EWQSKLDQTIKAMKKKTLDCGSQTDQVTTSDVI 

SKKEMAIMIEEQKCTIQQNLEQEKDIAIKGAMKK 

LEIELELKHCENITKQVEIAVQNAHQRWLGELPE 

LAEYQALVKAEQKKWEEQHEVSVNKRISFAVSE 

AKEKWKSELENMRKNILPGKELEEKIHSLQKELE 

LKNEEVPVVIRAELAKARSEWNKEKQEEIHRIQE 

QNEQDYRQFLDDHRNKINEYLAAAKEDFMKQK 

TELLLQKETELQTCLDQSRREWTMQEAKRIQLEI 

YQYEEDILTVLGVLLSDTQKEHISDSEDKQLLEI 

MSTCSSKWMSVQYFEKLKGCIQKAFQDTLPLLV 

ENADPEWKKRNMAELSKDSASQGTGQGDPGPA 

AGHHAQPLALQATEAEADKKKVLEIKDLCCGHC 

FQELEKAKQECQDLKGKLEKCCRHLQHLERKHK 

AVVEKIGEENNKWEELIEENNDMKNKLEELQT 

LCKTPPRSLSAGAIENACLPCSGGALEELRGQYIK 

AVKKIKCDMLRYIQESKERAAEMVKAEVL*ERQ 

ETARKMRKYYLICLQQILQDDGKEGAEKKIMNA 

ASKLATMAKLLETPISSKSQSKTTQSGMSK 


3552 


A 


771 


375 


ARTRQTSGQAREPEKESPAPGGGGLAEIRSRQQL 
SQTSRIPPLAKDQAVEAMFPPARGKELLSFEDVA 
MYFTREEWGHLNWGQKDLYRDVMLE^^yKNMV 
LLVYFQFDAAIPLC*TSLAHSSWLQLYFRLYF 


3553 


A 


76 


72 


PGVRGVEAPGGVAPGRNAMRRGERRDAGGPRP 

ESPVPAGRASLEEPPDGPS AGQATGPGEGRRSTE 

SEVYDDGTNTFFWRAHTLTVLFILTCTLGYVTLL 

EETPQDTAYNTKRGIVASILVFLCFGVTQAKDGP 

FSRPHPAYWRFWLCVSVVYELFLIFILFQTVQDG 

RQFLKYVDPKLGVPLPERDYGGNCLIYDPDNET 

DPFHNIWDKLDGFVPAHFLGWYLKTLMIRDWW 

MCMnSVMFEFLEYSLEHQLPNFSECWWDHWIM 

DVLVCNGLGIYCGMKTLEWLSLKTYKWQGLWN 

IPTYKGKMKRIAFQFTPYSWVRFEWKPASSLRR 

WLAVCGIILVFLLAELNTFYLKFVLWMPPEHYLV 

LLRLVFFVNVGGVAMRErVT)FMDDPKPHKKLGP 

QAWLVAAITATELLIVVKYDPHTLTLSLPFYISQC 

WTLGSVLALTWTVWRFFLRDITLRYKETRWQK 

WQNKDDQGSTVGNGDQHPLGLDEDLLGPGVAE 

GEGAPTPN*PRGPAPRPLPSAPRAVCGASSRR 


3554 


A 


2 


2106 


FDEFSALPSPSLQTSWSFGPMSRRALRRLRGEQR 

GQEPLGPGALHFDLRDDDDAEEEGPKRELGVRR 

PGGAGKEGVRVNNRFELINIDDLEDDPVVNGERS 

GCALTDAVAPGNKGRGQRGNTESKTDGDDTET 

VPSEQSHASGKLRKKKKKQKNKKSSTGEASENG 

LEDIDRILERIEDSTGLNRPGPAPLSSRKHVLYVE 










HRHLNPDTELKRYFGARAILGEQRPRQRQRVYP 

KCTWLTTPKSTWPRYSKPGLSMRLLESKKGLSFF 

AFEHSEEYQQAQHKFLVAVESMEPNNIVVLLQT 

SPYHVDSLLQLSDACRFQEDQEMARDLVERALY 

SMECAFHPLFSLTSGACRLDYRRPENRSFYLALY 

KQMSFLEKRGCPRTALEYCKLILSLEPDEDPLCM 

LLLIDHLALRARNYEYLIRLFQEWEVGASLAHRN 

LSQLPNFAFSVPLAYFLLSQQTDLPECEQSSARQ 

KASLLIQQALTMFPGVLLPLLESCSVRPDASVSSH 

RFFGPNAEISQPPALSQLVNLYLGRSHFLWKEPA 

TMSWLEENVHEVLQAVDAGDPAVEACENRRKV 

LYQRAPRNIHRHVILSEIKEAVAALPPDVTTQSV 

MGFDPLPPSDTIYSYVRPERLSPISHGNTIALFFRS 

LLPNYTMEGERPEEGVAGGLNRNQGLNRLMLA 

VRDMMANFHLNDLEAPHEDDA*GEGEWD 


3555 


A 


2 


2106 


FDEFSALPSPSLQTSWSFGPMSRRALRRLRGEQR 

GQEPLGPGALHFDLRDDDDAEEEGPKRELGVRR 

PGGAGKEGVRVNNRFELINIDDLEDDPVVNGERS 

GCALTDAVAPGNKGRGQRGNTESKTDGDDTET 

VPSEQSHASGKLRKKKKKQKNKKSSTGEASENG 

LEDIDRILERIEDSTGLNRPGPAPLSSRKHVLYVE 

HRHLNPDTELKRYFGARAILGEQRPRQRQRVYP 

KCTWLTTPKSTWPRYSKPGLSMRLLESKKGLSFF 

AFEHSEEYQQAQHKFLVAVESMEPNNIVVLLQT 

SPYHVDSLLQLSDACRFQEDQEMARDLVERALY 

SMECAFHPLFSLTSGACRLDYRRPENRSFYLALY 

KQMSFLEKRGCPRTALEYCKLILSLEPDEDPLCM 

LLLIDHLALRARNYEYLIRLFQEWEVGASLAHRN 

LSQLPNFAFSVPLAYFLLSQQTDLPECEQSSARQ 

KASLLIQQALTMFPGVLLPLLESCSVRPDASVSSH 

RFFGPNAEISQPPALSQLVNLYLGRSHFLWKEPA 

TMSWLEENVHEVLQAVDAGDPAVEACENRRKV 

LYQRAPRNIHRHVILSEIKEAVAALPPDVTTQSV 

MGFDPLPPSDTIYSYVRPERLSPISHGNTIALFFRS 

LLPNYTMEGERPEEGVAGGLNRNQGLNRLMLA 

VRDMMANFHLNDLEAPHEDDA*GEGEWD 


3556 


A 


3388 


1650 


KTRGT]V^VV^P^A^LQRHTGCFATIWLAATRGSRL 

VKREYLRVNWKTCEEILNYVLVRVQPPQPGLP 

RPRFSLYLSAQLQIGVIRVYSQQCQYLVEDIQHIL 

ERLHRAQLQIRmMETELPSLLLPNHLAMMETLE 

DAPDPFFGMMSVDPRLPSPFDIPQIRHLLEAAIPE 

RVEEIPPEVPTEPREPERIPVTVLPPEAITILEAEPIR 

MLEffiGERELPEVSRRELDLLIAEEEEAILLEIPRL 

PPPAPAE*GQELLDQVGCQCWEGSPHFSCPFPLR 

VEGMGEALGPEELRLTGWEPGALLMEVTPPEEL 

RLPAPPSPERRPPVPPPPRRRRRRRLLFWDKETQI 

SPEKFQEQLQTRAHCWECPMVQPPERTIRGPAEL 

FRTPTLSGWLPPELLGLWTHCAQPPPKALRRELP 

EEAAAEEERRKIEVPSEIEVPREALEPSVPLMVSL 

EISLEAAEEEKSRISLIPPEERWAWPEVEAPEAPA 

LPVVPELPEVPMEMPLVLPPELELLSLEAVHRAV 

ALELQANREPDFSSLVSPLSPRRMAARVFYLLLV 

LSAQQILHVKQEKPYGRLLIQPGPRFH 


3557 


A 


3388 


1650 


KTRGTMFYYPNVLQRHTGCFATIWLAATRGSRL 
VKREYLRVNWKTCEEILNYVLVRVQPPQPGLP 













RPRFSLYLSAQLQIGVIRVYSQQCQYLVEDIQHIL 

ERLHRAQLQnUDMETELPSLLLPNHLAMMETLE 

DAPDPFFGMMSVDPRLPSPFDIPQIRHLLEAAIPE 

RVEEIPPEVPTEPREPERIPVTVLPPEAITILEAEPIR 

MLEIEGERELPEVSRRELDLLIAEEEEAILLEIPRL 

PPPAPAE*GQELLDQVGCQCWEGSPHFSCPFPLR 

VEGMGEALGPEELRLTGWEPGALLMEVTPPEEL 

RLPAPPSPERRPPVPPPPRRRRRRRLLFWDKETQi 

SPEKFQEQLQTRAHCWECPMVQPPERTIRGPAEL 

FRTPTLSGWLPPELLGLWTHCAQPPPKALRRELP 

EEAAAEEERRKIEVPSEIEVPREALEPSVPLMVSL 

EISLEAAEEEKSRISLIPPEERWAWPEVEAPEAPA 

LPVVPELPEVPMEMPLVLPPELELLSLEAVHRAV 

ALELQANREPDFSSLVSPLSPRRMAARVFYLLLV 

LSAQQILHVKQEKPYGRLLIQPGPRFH 


3558 


A 


489 


2360 


IRPRPRGRRRALDSPNAAAPPVYVCRSPGEPTSL 

VNMASEDIAKLAETLAKTQVAGGQLSFKGKSLK 

LNTAEDAKDVIKEffiDFDSLEALRLEGNTVGVEA 

ARVIAKAL*KKSELKRCHWSDMFTGRLRTE1PPA 

LISLGEGLITAGAQLVELDLSDNAFGPDGVQGFE 

ALLKSSACFTLQELKLNNCGMGIGGGKILAAALT 

ECHRKSSAQGKPLALKVFVAGRNRLENDGATAL 

AEAFRVIGTLEEVHMPQNGINHPGITALAQAFAV 

NPLLRVINLNDNTFTEKGAVAMAETLKTLRQVE 

VINFGDCLVRSKGAVAIADAIRGGLPKLKELNLS 

FCEIKRDAALAVAEAMADKAELEKLDLNGNTLG 

EEGCEQLQEVLEGFNMAKVLASLSDDEDEEEEE 

EGEEEEEEAEEEEEEDEEEEEEEEEEEEEEPQQRG 

QGEKSATPSRKILDPNTGEPAPVLSSPPPADVSTF 

LAFPSPEKLLRLGPKSSVLIAQQTDTSDPEKVVSA 

FLKVSSVFKDEATVRMAVQDAVDALMQKAFNS 

SSFNSNTFLTRLLVHMGLLKSEDKVKAIANLYGP 

LMALNHMVQQDYFPKAIAPLLLAFVTKPNSALE 

SCSFARHSLLQTLYKV 


3559 


A 


489 


2360 


IRPRPRGRRRALDSPNAAAPPVYVCRSPGEPTSL 

VNMASEDIAKLAETLAKTQVAGGQLSFKGKSLK 

LNTAEDAKDVnCEIEDFDSLEALRLEGNTVGVEA 

ARVIAKAL*KKSELKRCHWSDMFTGRLRTEIPPA 

LISLGEGLITAGAQLVELDLSDNAFGPDGVQGFE 

ALLKSSACFTLQELKLNNCGMGIGGGKILAAALT 

ECHRKSSAQGKPLALKVFVAGRNRLENDGATAL 

AEAFRVIGTLEEVHMPQNGINHPGITALAQAFAV 

NPLLRVINLNDNTFTEKGAVAMAETLKTLRQVE 

VINFGDCLVRSKGAVAIADAIRGGLPKLKELNLS 

FCEIKRDAALAVAEAMADKAELEKLDLNGNTLG 

EEGCEQLQEVLEGFKMAKVLASLSDDEDEEEEE 

EGEEEEEEAEEEEEEDEEEEEEEEEEEEEEPQQRG 

QGEKSATPSRKILDPNTGEPAPVLSSPPPADVSTF 

LAFPSPEKLLRLGPKSSVLIAQQTDTSDPEKVVSA 

FLKVSSVFKDEATVRMAVQDAVDALMQKAFNS 

SSFNSNTFLTRLLVHMGLLKSEDKVKAIANLYGP 

LMALNHMVQQDYFPKALAPLLLAFVTKPNSALE 

SCSFARHSLLQTLYKV 


3560 


A 


2 


1198 


FVRELPRPRPGAATAAIMVSVINTVDTSHEDMIH 
DAQMDYYGTRLATCSSDRSVKIFDVRNGGQELIA 













DLRGHEGPVWQVAWAHPMYGNILASCSYDRKV 

IIWREENGTWEKSHEHAGHDSSVNSVCWAPHDY 

GLILACGSSDGAISLLTYTGEGQWEVKKINNAHT 

IGCNAVSWAPAVVPGSLIDHPSGQKPNYIKRFAS 

GGCDNLIKLWKEEEDGQWKEEQKLEAHSDWVR 

DVAWAPSIGLPTSTIASCSQDGRVFIWTCDDASS 

NTWSPKLLHKFNDVVWHVSWSITANILAVSGGD 

NKVTLWKESVDGQWVCISDVNKGQGSVSASVT 

EGQQNEQ*QDRWGLAPHPPAPGLPLPGPTNQTT 

GKSPQLQQDYFPRRSYRCSHRLIICLNVIGDAL 


3561 


A 


540 


86 


WRVKEMTSTLPKALGRKTASRSHTTLQGGSCCP 

VLWTAKLRCRKLRFPLPPPPPSSSAWPWQGWGI 

RGEQEAEGPLGETGPPVGPELSGLRQWRKLIKGR 

YGEWRGSGQKTGQPS*TTMQGGETEENRTETTT 

GNKQRESEAPWVRHTYIT 


3562 


A 


1920 


242 


PMMAMPFFERFKSSIQRPSPVLVLSQNTKJIESGR 

KVQSGNINAAKTIADIIRTCLGPKSMMKMLLDP 

MGGIVMTNDGNAILREIQVQHPAAKSMIEISRTQ 

DEEVGDGTTSVnLAGEMLSVAEHFLEQQMHPTV 

VISAYRKALDDMISTLKKISIPVDISDSDMMLNIIN 

SSITTKAISRWSSLACNIALDAVKMVQFEENGRK 

EIDIKKYARVEKIPGGIIEDSCVLRGVMINKDVTH 

PRMRRYIKNPRIVLLDSSLEYKKGESQTDIEITRE 

EDFTRILQMEEEYIQQLCEDIIQLKPDWITEKGIS 

DLAQHYLMRANITAIRRVRKTDNNRIARACGARI 

VSRPEELREDDVGTGAGLLEIKKIGDEYFTFITDC 

KDPKACTILLRGASKEILSEVERNFQDAMQVCRN 

VLLDPQLVPGGGASEMAVAHALTEKSKAMTGV 

EQWPYRAVAQALEVIPRTLIQNCGASHRLLTSLR 

AKHTQENCETWGVNGETGTLVDMKELGIWEPL 

AVKLQTYKTA\^TAVLLLRIDDIVSGHKKKGDD 

QSRQGGAPDAGQE 


3563 


A 


1571 


560 


GPSLLGTRGTPNPARTLQIFFLIIGRRLTGRMAAV 

DDLQFEEFGNAATSLTANPDATTVNffiDPGETPK 

HQPGSPRGSGREEDDELLGNDDSDKTELLAGQK 

KSSPFWTFEYYQTFFDVDTYQVFDRIKGSLLPIPG 

KNFVRLYIRSNPDLYGPFWICATLVFAIAISGNLS 

NFLIHLGEKTYHYVPEFRKVSIAATIIYAYAWLVP 

LALWGFLMWRNSKVMNIVSYSFLEIVCVYGYSL 

FIYIPTAILWIPHKAVRWILVMIALGISGSLLAMT 

FWPAVREDNRRVALATIVTIVLLHMLLSVGCLA 

YFFDAPEMDHLPTTTATPNQTVAAAKSS 


3564 


A 


1 


328 


NSRVDDFVAHLQRPLLGPASCLGILRPAMTAHSF 
ALPGIIFTTFWGLVGIAGPWFVPKGPNRGVIITML 
VATAVCCYLFWLIAILAQLNPLFGPQLKNETIWY 
VRFLWE 


3565 


A 


2 


1081 


FVTDFPARSMAATSLMSALAARLLQPAHSCSLRL 

RPFHLAAVRNEAVVISGRKLAQQIKQEVRQEVEE 

WVASGNKRPHLSVILVGENPASHSYVLNKTRAA 

AVVGmSETIMKPASISEEELLNLINKLNNDDNVD 

GLLVQLPLPEHIDERRICNAVSPDKDVDGFHVIN 

VGRMCLDQYSMLPATPWGVWEIIKRTGIPTLGK 

NVVVAGRSKNVGMPIAMLLHTDGAHERPGGDA 

TVTISHRYTPKEQLKKHTILADIVISAAGIPNLITA 

DMIKEGAAVIDVGINRVHDPVTAKPKLVGDVDF 



EGVRQKAGYITPVPGGVGPMTVAMLMKNTIIAA 
KKVLRLEEREVLKSKELGVATN 


3566 


A 


3 


1130 


SCRRGRQQQRW^SLSSQFAHTMAAPAQQTTQP 

GGGKRKGKAQYVLAKRARRCDAGGPRQLEPGL 

QGILITCNMNERKCVEEAYSLLNEYGDDMYGPE 

KFTDKDQQPSGSEGEDDDAEAALKKEVGDIKAS 

TEMRLRRFQSVESGANNVVFIRTLGIEPEKLVHHI 

LQDMYKTKKKKTOmRMLPISGTCKAFLEDMK 

KYAETFLEPWFKAPNKGTFQIVYKSRNNSHVNR 

EEVIRELAGIVCTLNSENKVDLTNPQYTVVVEirK 

AVCCLSVVKDYMLFRKYNLQEVVKSPKDPSQLN 

SKQGNGKEAKLESADKSDQNNTAEGKNNQQVP 

ENTEELGQTKPTSNPQVVNEGGAKPELASQATE 

GSKSNENDFS 


3567 


A 


248 


3498 


GKKDSSPWTCPFHPPLQLFFVIRNTRQLGDFHLA 

KIKVRNYWTADGDLDIGAKNVKLYVNRNLIFNG 

KLDKGDREAPADHSILVDQKNEKSEQLEEAMNA 

HSEESKGTHEMAGASGDKELGLGCSPPAETLAD 

AKLSSQGNVSGKRKNSTNCRKDSLSQLEEYLRLS 

AVPTSMGDMPSAPATSPPVKCPPVHEEPSLIQQL 

ENLMGRKICEPPGKTPSWLQPSPTGKDRKQGGR 

KPKPLWLSPEKPLAWKGRLPSDDVIGEGPGETEA 

RDKGLRHEPGWGTSRSVNTKERPQRATTKVHSD 

DSDIFNQPPNRERPASGRRGSRKDAGSSSHGDDQ 

PASREDTWSSRTPSRSRWRSEQEHTLHESWSSLS 

AFDRSHRGRISNTELPGDILDELLQQKSSRHSDLP 

PSKKGEQPGLSRGQDGYSGETDAGGDFKIPVLPY 

GQRLVIDIKSTWGDRHYVGLNGIEIFSSKGEPVQI 

SNIKADPPDINILPAYGKDPRVVTNLIDGVNRTQ 

DDMHVWLAPFTRGRSHSITIDFTHPCHVALIRIW 

>ryNKSRIHSFRGVKDITMLLDTQCIFEGEIAKASG 

TLAGAPEHFGDTILFTTDDDILEAIFYSDEMFDLD 

VGSLDSLQDEEAMRRPSTADGEGDERPFTQAGL 

GADERIPELELPSSSPVPQVTTPEPGIYHGICLQLN 

FTASWGDLHYLGLTGLEVVGKEGQALPIHLHQIS 

ASPRDLNELPEYSDDSRTLDKLIDGTNITMEDEH 

MWLIPFSPGLDHVVTIRLDRAESIAGLRFWNYNK 

SPEDTYRGAKIVHVSLDGLCVSPPEGFLIRKGPG 

NCHFDFAQEDLFVDYLRAQLLPQPARRLDMRSLE 

CASMDYEAPLMPCGFIFQFQLLTSWGDPYYIGLT 

GLELYDERGEKIPLSENNIAAFPDSVNSLEGVGG 

DVRTPDKLIDQVNDTSDGRHMWLAPILPGLVNR 

VYVIFDLPTTVSMIKLWNYAKTPHRGVKEFGLL 

VDDLLVYNGILAMVSHLVGGILPTCEPTVPYHTI 

LFTEDRDIRHQEKHTTISNQAEDQDVQMMNENQ 

nTNAKRKQSVVDPALRPKTCISEKETRRRRC 


3568 


A 


50 


1724 


AQGGTLSAASRFCRGGLLGPWLHPASEMAATLD 

LKSKEEKDAELDKRIEALRRKNEALIRRYQEIEE 

DRKKAELEGVAVTAPRKGRSVEKENVAVESEKN 

LGPSRRSPGTPRPPGASKGGRTPPQQGGRAGMG 

RASRSWEGSPGEQPRGGGAGGRGRRGRGRGSPH 

LSGAGDTSISDRKSKEWEERRRQNIEKMNEEME 

KIAEYERNQREGVLEPNPVRNFLDDPRRRSGPLE 

ESERDRREESRRHGRNWGGPDFERVRCGLEHER 

QGRRAGLGSAGDMTLSMTGRERSEYLRWKQER 



EKIDQERLQRHRKPTGQWRREWDAEKTDGMFK 

DGPVPAHH>SHRYDDQAWARPPKPPTFGEFLSQ 

HKAEASSRRRRKSSRPQAKAAPRAYSDHDDRWE 

TKEGAASPAPETPQPTSPETSPKETPMQPPEIPAP 

AHRPPEDEGEENEGEEDEEWEDISEDEEEEEIEVE 

EGDEEEPAQDHQAPEAAPTGIPCSEQAHGVPFSP 

EEPLLEPQAPGTPSSPFSPPSGHQPVSDWGEEVEL 

NSPRTTHLAGALSPGEAWPFESV 


3569 


A 


1 


912 


MGRVGRAGVQLGRRRTTWAAERTGQAAAGGP 

GRALRGQRPDLRSGGAADSPAAGRGELYCGVLP 

RSPWFLSERRRQMADFDTYDDRAYSSFGGGRGS 

RGSAGGHGSRSQKELPTEPPYTAYVGNLPFNTV 

QGDIDAIFKDLSIRSVRLVRDKDTDKFKGFCYVE 

FDEVDSLKEALTYDGALLGDRSLRVDIAEGRKQ

DKGGFGFRKGGPDDRGFRDDFLGGRGGSRPGDR 
RTGPPMGSRFRDGPPLRGSNMDFREPTEEERAQR 
PRLQLKPRTVATPLNQVANPNSAIFGGARPREEV 
VQKEQE 


3570 


A 


1 


912 


MGRVGRAGVQLGRRRTTWAAERTGQAAAGGP 

GRALRGQRPDLRSGGAADSPAAGRGELYCGVLP 

RSPWFLSERRRQMADFDTYDDRAYSSFGGGRGS 

RGSAGGHGSRSQKELPTEPPYTAYVGNLPFNTV 

QGDIDAIFKDLSIRSVRLVRDKDTDKFKGFCYVE

FDEVDSLKEALTYDGALLGDRSLRVDIAEGRKQ 

DKGGFGFRKGGPDDRGFRDDFLGGRGGSRPGDR 

RTGPPMGSRFRDGPPLRGSNMDFREPTEEERAQR 

PRLQLKPRTVATPLNQVANPNSAIFGGARPREEV 

VQKEQE 


3571 


A 


28 


131 


RHFFGNLCAMRAKWRKKRMRRLKJUCRRKMRQ 
RSK 


3572 


A 


3 


1202 


QSEPHRKVRVDPPVRDRPPPHPPPLLVQRALPGQ 

GQAEGSDGADGAKRRAMAHQTGIHATEELKEFF 

AKARAGSVRLIKWIEDEQLVLGASQEPVGRWD 

QDYDRAVLPLLDAQQPCYLLYRLDSQNAQGFE 

WLFLAWSPDNSPVRLKMLYAATRATVKKEFGG 

GHIKDELFGTVKDDLSFAGYQKHLSSCAAPAPLT 

SAERELQQIRINEVKTEISVESKHQTLQGLAFPLQ 

PEAQRALQQLKQKMVNYIQMKLDLERETIELVH 

TEPTDVAQLPSRVPRDAARYHFFLYKHTHEGDP 

LESVVFIYSMPGYKCSIKERMLYSSCKSRLLDSV 

EQDFHLEIAKIODEIGDGAELTAEFLYDEVHPKQH 

AFKQAFAKPKGPGGKRGHKRLIRGPGENGDDS 


3573 


A 


49 


1869 


PHCEPNPGAGAMVLLHVLFEHAVGYALLALKEV 

EEISLLQPQVEESVLNLGKFHSIVRLVAFCPFASS 

QVALENANAVSEGWHEDLRLLLETHLPSKKKK 

VLLGVGDPKIGAAIQEELGYNCQTGGVIAEILRG 

VRLHFHNLVKGLTDLSACKAQLGLGHSYSRAKV 

KFNVNRVDNMIIQSISLLDQLDKDINTFSMRVRE 

WYGYHFPELVKIINDNATYCRLAQFIGNRRELNE 

DKLEKLEELTMDGAKAKAILDASRSSMGMDISAl 

DLENIESFSSRVVSLSEYRQSLHTYLRSKMSQVAP 

SLSALIGEAVGARLIAHAGSLTNLAKYPASTVQIL 

GAEKALFRALKTRGNTPKYGLIFHSTFIGRAAAK 

NKGRISRYLANKCSIASRIDCFSEVPTSVFGEKLR 

EQVEERLSFYETGEJPRKNLDVMKEAMVQAEAE 



EAAAEITRKIEKQEKKRLKKEKKRLAALALASS 

ENSSSTPEECEETSEKPKKKKKQKPQEVPQENGM 

EDPSISFSKPKKKKSFSKEELMSSDLEETAGSTSIP 

KRKKSTPKEETVNDPEEAGHRSRSKKKRKFSKEE 

PVSSGPEEAVGKSSSKKKKKFHKASQED 


3574 


A 


284 


2032 


CGNERTARLWVQPVVSTMPQASEHRLGRTREPP 

VNIQPRVGSKLPFAPRARSKERRNPASGPNPMLR 

PLPPRPGLPDERLKKLELGRGRTSGPRPRGPLRA 

DHGVPLPGSPPPTVALPLPSRTNLARSKSVSSGDL 

RPMGIALGGHRGTGELGAALSRLALRPEPPTLRR 

STSLRRLGGFPGPPTLFSIRTEPPASHGSFHMISAR 

SSEPFYSDDKMAHHTLLLGSGHVGLRNLGNTCF 

LNAVLQCLSSTRPLRDFCLRRDFRQEVPGGGRA 

QELTEAFADVIGALWHPDSCEAVNPTRFRAVFQ 

KYVPSFSGYSQQDAQEFLKLLMERLHLEINRRGR 

RAPPILANGPVPSPPRRGGALLEEPELSDDDRANL 

MWKRYLEREDSKIVDLFVGQLKSCLKCQACGY 

RSTTFEVFCDLSLPIPKKGFAGGKVSLRDCFNLFT 

KEEELESENAPVCDRCROKTRSTKKLTVORFPRI 

LVLHLNRFSASRGSIKKSSVGVDFPLQRLSLGDF 

ASDKAGSPVYQLYALCNHSGSVHYGHYTALCR 

CQTGWHVYNDSRVSPVSENQVASSEGYVLFYQL 

MQEPPRCL 


3575 


A 


1 


2408


RELDSLADLPERIKPPYANGLSTSHLRSSSVEDVK 

LIISEGRPTIEVRRCSMPSVICEHTKQFQTISEESN 

QGSLLTVPGDTSPSPKPEVFSNVPERDLSNVSNIH 

SSFATSPTGASNSKYVSADRNLIKNTAPVNTVMD 

SPVHLEPSSQVGVIQNKSWEMPVDRLETLSTRDF 

ICPNSNIPDQESSLQSFCNSENKVLKENADFLSLR 

QTELPGNSCAQDPASFMPPQQPGSFPSQSLSDAES 

ISKHMSLSYVANQEPGILQQKNAVQnSSALDTD 

NESTKDTENTFVLGDVQKTDAFVPVYSDSTIQEA 

SPNFEKAYTLPVLPSEKDFNGSDASTQLNTHYAF 

SKLTYKSSSGHEVENSTTDTQVISHEKENKLESL 

VLTHLSRCDSDLCEMNAGMPKGNLNEQDPKHC 

PESEKCLLSIEDEESQQSILSSLENHSQQSTQPEM 

HKYGQLVKVELEENAEDDKTBNQIPQRMTRNK 

ANTMANQSKQILASCTLLSEKDSESSSPRGRIRLT 

EDDDPQIHHPRKRKVSRVPQPVQVSPSLLQAKEK 

TQQSLAAIVDSLKLDEIQPYSSERANPYFEYLHIR 

KKIEEKRKLLCSVIPQAPQYYDEYVTFNGSYLLD 

GNPLSKICIPTITPPPSLSDPLKELFRQQEVVRMKL 

RLQHSIEREKLIVSNEQEVLRVHYRAARTLANQT 

LPFSACTVLLDAEVYNVPLDSQSDDSKTSVRDRF 

NARQFMSWLQDVDDKFDKLKTCLLMRQQHEA 

AALNAVQRLEWQLKLQELDPATYKSISIYEIQEF 

YVPLVDVNDDFELTPI 


3576 


A 


5 


1421 


LRLAWHDGARWPLGTPRAAATRREAAALPPVT 

LALLCLDGVFLSSAENDFVHRIQEELDRFLLQKQ 

LSKVLLFPPLSSRLRYLIHRTAENFDLLSSFSVGE 

GWKRRTVICHQDIRVPSSDGLSGPCRAPASCPSR 

YHGPRPISNQGAAAVPRGARAGRWYRGRKPDQ 

PLYVPRVLRRQEEWGLTSTSVLKREAPAGRDPEE 

PGDVGAGDPNSDQGLPVLMTQGTEDLKGPGQR 

CENEPLLDPVGPEPLGPESQSGKGDMVEMATRF 



GSTLQLDLEKGKESLLEKRLVAEEEEDEEEVEED 
GPSSCSEDDYSELLQEITDNLTKKEIQIEKIHLDTS 
SFMEELPGEKDLAHWEIYDFEPALKTEDLLATF 
SEFQEKGFRIQWVDDTHALGIFPCRASAAEALTR 
EFSVLKIRPLTQGTKQSKLKALQRPKLLRLVKER 
PQTNATVARRLVARALGLQHKKKERPAVRGPLP 
P 


3577 


A 


102 


1998 


DTRTPGSLEMGPLQFRDVAIEFSLEEWHCLDTAQ 

RNLYRNVMLENYSNLVFLGIWSKPDLIAHLEQG 

KKPLTMKRHEMVANPSGPVICSHFAQDLWPEQN 

IKDSFQKVILRRYEKRGHGNLQLIKRCESVDECK 

VHTGGYNGLNQCSTTTQSKVFQCDKYGKVFHK 

FSNSNRHNIRHTEKKPFKCIECGKAFNQFSTLITH 

KKfflTGEKPYICEECGKAFKYSSALNTHKRIHTG 

EKPYKCDKCDKAFIASSTLSKHEIIHTGKKPYKCE 

ECGKAFNQSSTLTKHKKIHTGEKPYKCEECGKAF 

NQSSTLTKHKKIHTGEKPYVCEECGKAFKYSRIL 

TTHKRIHTGEKPYKCNKCGKAFIASSTLSRHEFIH 

MGKKHYKCEECGKAFIWSSVLTRHKRVHTGEKP 

YKCEECGKAFKYSSTLSSHKRSHTGEKPYKCEEC 

GKAFVASSTLSKHEIIHTGKKPYKCEECGKAFNQ 

SSSLTKHKKIHTGEKPYKCEECGKAFNQSSSLTK 

HKKIHTGEKPYKCEECGKAFNQSSTLIKHKKIHT 

REKPYKCEECGKAFHLSTHLTTHKILHTGEKPYR 

CRECGKAFNHSATLSSHKKIHSGEKPYECDKCG 

KAFISPSSLSRHEIIHTGEKP 


3578 


A 


1725 


445 


RPRRRGTHHFSCVLGSFRVSAMFPRVSTFLPLRP 

LSRHPLSSGSPETSAAAIMLLTVRHGTVRYRSSA 

LLARTKKNIQRYFGTNSVICSKKDKQSVRTEETS 

KETSESQDSEKENTKKDLLGIIKGMKVELSTVNV 

RTTKPPKRRPLKSLEATLGRLRRATEYAPKKRIEP 

LSPELVAAASAVADSLPFDKQTTKSELLSQLQQH 

EEESRAQRDAKRPKISFSNIISDMKVARSATARV

RSRPELRIQFDEGYDNYPGQEKTDDLKKRKNIFT 

GKRLNIFDMMAVTKEAPETDTSPSLWDVEFAKQ 

LATVNEQPLQNGFEELIQWTKEGKLWEFPINNEA 

GFDDDGSEFHEHIFLEKHLESFPKQGPIRHFIVDBLV 

TCGLSKNPYLSVKQKVEHIEWFKNYFNEKKDILK 

ESNIQFKLRPWKFLFRNN 


3579 


A 


1725 


445 


RPRRRGTHHFSCVLGSFRVSAMFPRVSTFLPLRP 

LSRHPLSSGSPETSAAAIMLLTVRHGTVRYRSSA 

LLARTKNNIQRYFGTNSVICSKKDKQSVRTEETS 

KETSESQDSEKENTKKDLLGIIKGMKVELSTVNV

RTTKPPKRRPLKSLEATLGRLRRATEYAPKKRIEP 

LSPELVAAASAVADSLPFDKQTTKSELLSQLQQH 

EEESRAQRDAKRPKISFSNIISDMKVARSATARV 

RSRPELRIQFDEGYDNYPGQEKTDDLKKRKNIFT 

GKRLNIFDMMAVTKEAPETDTSPSLWDVEFAKQ 

LATYNEQPLQNGFEELIQWTKEGKLWEFPINNEA 

GFDDDGSEFHEHIFLEKHLESFPKQGPIRHFMELV 

TCGLSKNPYLSVKQKVEHIEWFRNYFNEKKDDLK

ESNIQFKLRPWKFLFRNN 


3580 


A 


3673 


1619 


LYCVAPYSRHLLGRMSHLPMKLLRKKIEKRNLK 
LRQRNLKFQGASNLTLSETQNGDVSEETMGSRK 
VKKSKQKPMNVGLSETQNGGMSQEAVGNIKVT 



KSPQKSTVLTNGEAAMQSSNSESKKKKKKKRK 

MVNDAEPDTKKAKTENKGKSEEESAETTKETEN 

NVEKPDNDEDESEVPSLPLGLTGAFEDTSFASLC 

NLVNENTLKAIKEMGFTNMmQHKSIRPLLEGR 

DLLAAAKTGSGKTLAFLIPAVELIVKLRFMPRNG 

TGVLILSPTRELAMQTFGVLKELMTHHVHTYGLI 

MGGSNRSAEAQKLGNGINnVATPGRLLDHMQN 

TPGFMYKNLQCLVIDEADRILDVGFEEELKQIIKL 

LPTRRQTMLFSATQTRKVEDLARISLKKEPLYVG 

VDDDKANATVDGLEQGYVVCPSEKRFLLLFTFL 

KKNRKKKLNTSnrSSCMSVKYHYELLNYIDLPVL 

AIHGKQKQNKRTTTFFQFCNADSGTLLCTDVAA 

RGLDIPEVDWIVQYDPPDDPKEYIHRVGRTARGL 

NGRGHALLILRPEELGFLRYLKQSKVPLSEFDFS 

WSKISDIQSQLEKLIEKNYFLHKSAQEAYKSYIRA 

YDSHSLKQIFNVNNLNLPQVALSFGFKVPPFVDL 

NVNSNEGKQKKRGGGGGFGYQKTKKVEKSKIF 

KHISKKSSDSRQFSH 


3581 


A 


23 


453 


LCRCICIKNITPHCLWDKVLSQFTYILDNLSNFMS 

HHPHSLRNSCLIRMDLLYWQFTIYTITFCFSHLSG 

RLTLSAQHISHRPCLLSYSLLFWKVHHLFLEGFPC 

SPRLDEMSFHQFPQHPVHVSWHLPIVYKGSMT 

QVSPH 


3582 


A 


3 


950 


TRGCGNKMAGKKNVLSSLAVYAEDSEPESDGEA 

GIEAVGSAAEEKGGLVSDAYGEDDFSRLGGDED 

GYEEEEDENSRQSEDDDSETEKPEADDPKDNTE 

AEKRDPQELVASFSERVRNMSPDEIKIPPEPPGRC

SNHLQDKIQKLYERKIKEGMDMNYIIQRKKEFRN

PSIYEKLIQFCAIDELGTNYPKDMFDPHGWSEDS 

YYEALAKAQKIEMDKLEKAKKERTKIEFVTGTK 

KGTTTNATSTTTTTASTAVADAQKRKSKWDSAI 

PVTTIAQPTILTTTATLPAVVTVTTSASGSKTTVIS

AVGTIVKKAKQ 


3583 


A 


3 


950 


TRGCGNPIMAGKKNVLSSLAVYAEDSEPESDGEA 

GIEAVGSAAEEKGGLVSDAYGEDDFSRLGGDED 

GYEEEEDENSRQSEDDDSETEKPEADDPKDNTE 

AEKRDPQELVASFSERVRNMSPDEIKIPPEPPGRC 

SNHLQDKIQKLYERKIKEGMDMNYIIQRKKEFRN

PSIYEKLIQFCAIDELGTNYPKDMFDPHGWSEDS 

YYEALAKAQKIEMDKLEKAKKERTKIEFVTGTK 

KGTTTNATSTTTTTASTAVADAQKRKSKWDSAI 

PVTTIAQPTILTTTATLPAVVTVTTSASGSKTTVIS 

AVGTIVKKAKQ 


3584 


A 


3 


1139 


PGSTISSRADRLGAPVLAHPKMAERQEEQRGSPP 

LRAEGKADAEVKLILYHWTHSFSSQKVRLVIAE 

KALKCEEHDVSLPLSEHNEPWFMRLNSTGEVPV 

LIHGENUCEATQHDYLEQTFLDERTPRLMPDKES 

MYYPRVQHYRELLDSLPMDAYTHGCILHPELTV 

DSMIPAYATTRIRSQIGNTESELKKLAEENPDLQE 

AYIAKQKRLKSKLLDHDNVKYLKKILDELEKVL 

DQVETELPRRNEETPEEGQQPWLCGESFTLADVS 

LAVTLHRLKFLGFARKNWGNGKRPNLETYYERV 

LKRKTFNKVLGHVNNILISAVLPTAFRVAKKRAP 

KVLGTTLVVGLLAGVGYFAFMLFRKRLGSMILA 

LRPRPNYF 





3585 


A 


1 


1777 


RRHSPGSPAFAPSSRATAICPRAARAPATLLLALG 

AVLWPAAGAWELTILHTNDVHSRLEQTSEDSSK 

CVNASRCMGGVARLFTKVQQIRRAEPNVLLLDA 

GDQYQGTIWFTVYKGAEVAHFMNALRYDAMA 

LGNHEFDNGVEGLIEPLLKEAKFPILSANIKAKGP 

LASQISGLYLPYKVLPVGDEWGIVGYTSKETPF 

LSNPGTNLVFEDEITALQPEVDKLKTLNVNKIIAL 

GHSGFEMDKLIAQKVRGVDWVGGHSNTFLYT 

GNPPSKEVPAGKYPFIVTSDDGRKVPWQAYAF 

GKYLGYLKIEFDERGNVISSHGNPILLNSSIPEDPS 

IKADINKWRIKLDNYSTQELGKTTVYLDGSSQSC 

RFRECNMGNLICDAMINN>a.RHTDEMFWNHVS 

MCILNGGGIRSPIDERNNGTITWENLAAVLPFGG 

TFDLVQLKGSTLKKAFEHS VHRYGQSTGEFLQV 

GGIHWYDLSRKPGDRVVKLDVLCTKCRVPSYD 

PLKMDEVYKVILPNFLANGGDGFQMIKDELLRH 

DSGDQDINVVSTYISKMKVIYPAVEGRIKFSTGS 

HCHGSFSLIFLSLWAVIFVLYQ 


3586 


A 


1399 


881 


LSNKDVLSPQLKDENSKLRRKLNEVQSFSEAQTE 

MVRTLERKLEAKMIKEESDYHDLESVVQQVEQN 

LELMTKRAVKAENHVVKLKQEISLLQAQVSNFQ 

RENEALRCGQGASLTVVKQNADVALQNLRWM 

NSAQASIEQLVSGAETLNLVAEILKSIDRISEVKD 

EEEDS 


3587 


A 


88 


1639 


GCVGRGLPLPPRHPTPPSSSSSPFVLLAFLLLVRL 

DPAVSGKMAAPRPPPARLSGVMVPAPIQDLEAL 

RALTALFKEQRNRETAPRTIFQRVLDILKKSSHA 

VELACRDPSQVENLASSLQLITECFRCLRNACIEC 

SVNQNSIRNLDTIGVAVDLILLFRELRVEQESLLT 

AFRCGLQFLGNIASRNEDSQSJVWVHAFPELFLS 

CLNHPDKKIVAYSSMILFTSLNHERMKELEENLN 

lAIDVIDAYQKHPESEWPFLIITDLFLKSPELVQA 

MFPKLNNQERVTLLDLMIAKITSDEPLTICDDIPVF 

LRHAELIASTFVDQCKTVLKLASEEPPDDEEALA 

TIRLLDVLCEMTVNTELLGYLQVFPGLLERVIDL 

LRVIHVAGKETTNIFSNCGCVRAEGDISNVANGF 

KSHLIRLIGNLCYKNKDNQDKVNELDGIPLILDN 

CNISDSNPFLTQWVIYAIRNLTEDNSQNQDLIAK 

MEEQGLADASLLKKVGFEVEKKGEKLILKSTRD 

TPKP 


3588 


A 


3 


1462 


DSPRNRFEELGRPTRTPTRPGPRPAMEDLDALLSD 

LETTTSHMPRSGAPKERPAEPLTPPPSYGHQPQT 

GSGESSGASGDKDHLYSTVCKPRSPKPAAPAAPP 

FSSSSGVLGTGLCELDRLLQELNATQFNITDEIMS 

QFPSSKVASGEQKEDQSEDKKRPSLPSSPSPGLPK 

ASATSATLELDRLMASLSDFRVQNHLPASGPTQP 

PWSSTNEGSPSPPEPTGKGSLDTMLGLLQSDLSR 

RGVPTQAKGLCGSCNKPIAGQVVTALGRAWHPE 

HFVCGGCSTALGGSSFFEKDGAPFCPECYFERFSP 

RCGFCNQPIRHKMVTALGTHWHPEHFCCVSCGE 

PFGDEGFHEREGRPYCRRDFLQLFAPRCQGCQGP 

ILDNYISALSALWHPDCFVCRECFAPFSGGSFFEH 

EGRPLCENHFHARRGSLCATCGLPVTGRCVSAL 

GRRFHPDHFTCTFCLRPLTKGSFQERAGKPYCQP 

CFLKLFG


3589 


A 


226 


6793 


SPPKKSRKCNLSFRLISAERWRFFLLILMEMPRKP 

RLTLFVQRRIENIATEREFDPEEFYYLLEAAEGHA 

KEGQGKTDIPRYIISQLGLNKDPLEEMAHLGNY 

DSGTAETPETDESVSSSNASLKLRRKPRESDFETI 

KLISNGAYGAVYFVRHKESRQRFAMKKINKQNL 

ILRNQIQQAFVERDILTFAENPFWSMYCSFETRR 

HLCMVMEYVEGGDCATLMKNMGPLPVDMARM 

YFAETVLALEYLHOTGAOWDLKPDNLLVTSMG 

HIKLTDFGLSKVGLMSMTTNLYEGHIEKDAREFL 

DKQVCGTPEYIAPEVILRQGYGKPVDWWAMGII 

LYEFLVGCVPFFGDTPEELFGQVISDEINWPEKDE 

APPPDAQDLITLLLRQNPLERLGTGGAYEVKQHR 

FFRSLDWNSLLRQKAEFDPQLESEDDTSYFDTRSE 

KYHHMETEEEDDTNDEDFNVEIRQFSSCSHRFSK 

VFSSIDRITQNSAEEKEDSVDKTKSTTLPSTETLS 

WSSEYSEMQQLSTSNSSDTESNRHKLSSGLLPKL 

AISTEGEQDEAASCPGDPHEEPGKPALPPEECAQ 

EEPEVTTPASTISSSTLSVGSFSEHLDQINGRSECV 

DSTDNSSKPSSEPASHMARQRLESTEKKKISGKV 

TKSLSASALSLMIPGDMFAVSPLGSPMSPHSLSSD 

PSSSRDSSPSRDSSAASASPHQPIVIHSSGKNYGFT 

IRAIRVYVGDSDIYTVHHIVWNVEEGSPACQAGL 

KAGDLITHINGEPVHGLVHTEVIELLLKSGNKVSI 

TTTPFENTSIKTGPARRNSYKSRMVRRSKKSKKK 

ESLERRRSLFKKLAKQPSPLLHTSRSFSCLNRSLS 

SGESLPGSPTHSLSPRSPTPSYRSTPDFPSGTNSSQ 

SSSPSSSAPNSPAGSGfflRPSTLHGLAPKLGGQRY 

RSGRRKSAGNIPLSPLARTPSPTPQPTSPQRSPSPL 

LGHSLGNSKIAQAFPSKMHSPPTIVRHIVRPKSAE 

PPRSPLLICRVQSEEKLSPSYGSDKKHLCSRKHSL 

EVTQEEVQREQSQREAPLQSLDENVCDVPPLSRA 

RPVEQGCLKRPVSRKVGRQESVDDLDRDKLKAK 

VVVKKADGFPEKQESHQKFHGPGSDLENFALFK 

LEEREKKVYPKAVERSSTFENKASMQEAPPLGSL 

LKDALHKQASVRASEGAMSDGPVPAEHRQGGG 

DFRRAPAPGTLQDGLCHSLDRGISGKGEGTEKSS 

QAKELLRCEKLDSKLANTOYLRKKMSLEDKEDN 

LCPVLKPKMTAGSHECLPGNPVRPTGGQQEPPPA 

SESRAFVSSTHAAQMSAVSFVPLKALTGRVDSGT 

EKPGLVAPESPVRKSPSEYKLEGRSVSCLEPIEGT 

LDIALLSGPQASKTELPSPESAQSPSPSGDVRASV 

PPVLPSSSGKKNDTTSARELSPSSLKMNKSYLLEP 

WFLPPSRGLQNSPAVSLPDPEFKRDRKGPHPTAR 

SPGTVMESNPQQREGSSPKHQDHTTDPKLLTCLG 

QNLHSPDLARPRCPLPPEASPSREKPGLRESSERG 

PPTARSERSAARADTCREPSMELCFPETAKTSDN 

SKNLLSVGRTHPDFYTQTQAMEKAWAPGGKTN 

HKDGPGEARPPPRDNSSLHSAGBPCEKELGKVRR 

GVEPKPEALLARRSLQPPGDBSEKSEKLSSFPSLQ 

KDGAKEPERKEQPLQRHPSSIPPPPLTAKDLSSPA 

ARQHCSSPSHASGREPGAKPSTAEPSSSPQDPPKP 

VAAHSESSSHKPRPGPDPGPPKTKHPDRSLSSQK 

PSVGATKGKEPATQSLGGSSREGKGHSKSGPDVF 

PATPGSQNKASDGIGQGEGGPSVPLHTDRAPLDA 

KPQPTSGGRPLEVLEKPVHLPRPGHPGPSEPADQ

KLSAVGEKQTLSPKHPKPSTVKDCPTLCKQTDN 

RQTDKSPSQPAANTDRRAEGKKCTEALYAPAEG 

DKLEAGLSFVHSENRLKGAERPAAGVGKGFPEA 

RGKGPGPQKPPTEADKPNGMKRSPSATGQSSFRS 

TALPEKSLSCSSSFPETRAGVREASAASSDTSSAK 

AAGGMLELPAPSNRDHRKAQPAGEGRTHMTKS 

DSLPSFRVSTLPLESHHPDPNTMGGASHRDRALS 

VTATVGETKGKDPAPAQPPPARKQNVGRDVTKP 

SPAPNTDRPISLSNEKDFVVRQRRGKESLRSSPHK 

KAL 


3590 


A 


3 


935 


RATTRPKNEVQDYVSVEYLSPHMGGTDPFKYSY 

PPLVDDDFQTPLCENGPITSEDETSSKEDIESDGK 

ETLETISNEEQTPLLKKINPTESTSKAEENEKVDS 

KVKAFKKPLSVFKGPLLHISPAEELYFGSTESGEK 

KTLIVLTNVTKNIVAFKVRTTAPEKYRVKPSNSS 

CDPGASVDIVVSPHGGLTVSAODRFLIMAAEME 

QSSGTGPAELTQFWKEVPRNKVMEHRLRCHTVE 

SSKPNTLTLKDNAFNMSDKTSEDICLQLSRLLES 

NRKLEDQVQRCIWFQQLLLSLTMLLLAFVTSFFY 

LLYS 


3591 


A 


303 


2 


GGSWGPLCPVSPAMSLSDPGLGYHPTCWTLRWP 
PLCSLHALHVFHCLFSSRLGTPVSPRLAMDPNCS 

CEAGGSCACAGSCKCKKCKCTSCKKSCCSCCPL 


3592 


A 


1052 


1779 


GKTMMRKMLLAAALSVTAMTAHADYQCSVTP 

RDDVIVSPQTVQVKGENGNLVITPDGNVMYNGK 

QYSLNAAQREQAKDYQAELRSTLPWDDEGAKSR 

VEKARIALDKIIVOEMGESSKMRSRLTKLDAOVK 

EQMNRJIETRSDGLTFHYKAIDQVRAEGQQLVNQ 

AMGGILQDSINEMGAKAVLKSGGNPLQNVLGSL 

GGLQSSIQTEWKKQEKDFQQFGKDVCSRVVTLE 

DSRKALVGNLK 


3593 


A 


3 


1837 


LSFEKVDIQTDNDLTKEMYEGBCENVSFELQRDFS 

QETDFSEASLLEKQQEVHSAGNIKKEKSNTIDGT 

VKDETSPVEECFFSQSSNSYQCHHTGEQPSGCTG 

LGKSISFDTKLVKHEIINSEERPFKCEELVEPFRCD 

SQLIQHQENNTEEKPYQCSECGKAFSINEKLIWH 

QRLHSGEKPFKCVECGKSFSYSSHYITHQTIHSGE 

KPYQCKMCGKAFSVNGSLSRHQRIHTGEKPYQC 

KECGNGFSCSSAYITHQRVHTGEKPYECNDCGK 

AFNGNAKLIQHQRIHTGEKPYECNECGKGFRCSS 

QLRQHQSIHTGEKPYQCKECGKGFNNNTKLIQH 

QRIHTASLAEQLFKASGNHPNWGCCLTISSPGPS 

VYGPKMNMRGAPNSRLAGGREKRTQDTDFGQC 

SFLPSHSPSCFEPWNVTDYDSSWYRQKQVLSGV 

WSSPLSILKLPRTLIRISIHIQEMDTPGEMLMTGR 

GSLGPTLTTEAPAAAQPGKQGPPGTGRCLQAPGT 

EPGEQTPEGARELSPLQESSSPGGVKAEEEQRAG 

AEPGTRPSLARSDDNDHEVGALGLQQGKSPGAG 

NPEPEQDCAARAPVRAEAVRRMPPGAEAGSVVL 

DD 


3594 


A 


39 


261 


RAAMMDTSRVQPIKLAIVIKVLGRTGSQGQCTQ 

VRVEFMDDTSRSHRSVKGPVREGDVLTLLESERE 

ARRLR 


3595 


A 


973 


68 


GRVGTKHQMADDAGAAGGPGGPGGPGMGNRG 
GFRGGFGSGIRGRGRGRGRGRGRGRGARGGKAE
DKEWMPVTKLGRLVKDMKIKSLEEiyLFSLPIKE 

SEIIDFFLGASLKDEVLKIMPVQKQTRAGQRTRF 

KAFVAIGDYNGHVGLGVKCSKEVATAIRGAIILA 

KLSIVPVRRGYWGNKIGKPHTVPCKVTGRCGSV 

LVRLIPAPRGTGIVSAPVPKKLLMMAGIDDCYTS 

ARGCTATLGNFAKATFDAISKTYSYLTPDLWKE 

TWTKSPYQEFTDHLVKTHTRVSVQRTQAPAVA 

TT 


3596 


A 


106 


2960 


DERRVGAADMFGRSRSWVGGGHGKTSRNIHSL 

DHLKYLYHVLTKNTTVTEQNRNLLVETIRSITEIL 

IWGDQNDSSVFDFFLEKNMFVFFLNILRQKSGRY 

VCVQLLQTLNILFENISHETSLYYLLSNNYVNSII 

VHKFDFSDEEIMAYYISFLKTLSLKLNNHTVHFF 

YNEHTNDFALYTEAIKFFNHPESMVRIAVRTITL 

NVYKVSLDNQAMLHYIRDKTAVPYFSNLVWFIG 

SHVIELDDCVQTDEEHRNRGKLSDLVAEHLDHL 

HYLNDILIINCEFLNDVLTDHLLNRLFLPLYVYSL 

ENQDKGGERPKISLPVSLYLLSQVFLIIHHAPLVN 

SLAEVILNGDLSEMYAKTEQDIQRSSAKPSIRCFI 

KPTETLERSLENmKHKGKRRVQKRPNYKNVGEE 

EDEEKGPTEDAQEDAEKAKGTEGGSKGIKTSGES 

EEIEMVIMERSKLSELAASTSVQEQNTTDEEKSA 

AATCSESTQWSRPFLDMVYHALDSPDDDYHALF 

VLCLLYAMSHNKGMDPEKLERIQLPVPNAAEKT 

TYNHPLAERLIRIMNNAAQPDGKIRLATLELSCL 

LLKQQVLMSAGCIMKDVHLACLEGAREESVHLV 

RHFYKGEDIFLDMFEDEYRSMTMKPMNVEYLM 

MDASILLPPTGTPLTGIDFVKRLPCGDVEKTRRAI 

RVFFMLRSLSLQLRGEPETQLPLTREEDLIKTDDV 

LDLNNSDLIACTVITKDGGMVQRSLAVDIYQMS 

LVEPDVSRLGWGVVKFAGLLQDMQVTGVEDDS 

RALNITIHKPASSPHSKPFPILQATFIFSDHIRCIIAK 

QRLAKGRIQARRMKMQRIAALLDLPIQPTTEVLG 

FGLGSSTSTQHLPFRFYDQGRRGSSDPTVQRSVF 

ASVDKVPGFAVAQCINEHSSPSLSSQSPPSASGSP 

SGSGSTSHCDSGGTSSSSTPSTAQSPAGIGHVTQ 


3597 


A 


427 


277 


GVRRIQHHWAQMHECNVHTYASLFCLFLLHTG 
KLCCLNSHRHFHCIKYSK 


3598 


A 


1 


503 


FRPRTKKATAMYLEHYLDSIENLPCELQRNFQL 

MRELDQRTEDKKAEBDILAAEYISTVKTLSPDQR 

VERLQKIQNAYSKCKEYSDDKVQLAMQTYEMV 

DKHIRRLDADLARFEADLKDKMEGSDFESSGGR 

GLKKGRGQKEKRGSRGRGRRTSEEDTPKKKKH 

KGG 


3599 


A 


2 


3907 


KTITALAFSPDGKYLVTGESGHMPAVRVWDVAE 

HSQVAELQEHKYGVACVAFSPSAKYIVSVGYQH 

DMIVNVWAWKKNIVVASNKVSSRVTAVSFSED 

CSYFVTAGNRHIKFWYLDDSKTSKVNATVPLLG 

RSGLLGELRNNLFTDVACGRGKKADSTFCITSSG 

LLCEFSDRRLLDKWVELRVYPEVKDSNQACLPP 

SSFITCSSDNTIRLWNTESSGVHGSTLHRNILSSDL 

IKIIYVDGNTQALLDTELPGGDKADASLLDPRVGI 

RSVCVSPNGQHLASGDRMGTLRVHELQSLSEML 

KVEAHDSEILCLEYSKPDTGLKLLASASRDRLIH 

VLDAGREYSLQQTLDEHSSSITAVKFAASDGQVR 



MISCGADKSIYFRTAQKSGDGVQFTRTHHWRK 

TTLYDMDVEPSWKYTAIGCQDRNIRIFNISSGKQ 

KKLFKGSQGEDGTLIKVQTDPSGIYIATSCSDKNL 

SIFDFSSGECVATMFGHSEIVTGMKFSNDCKHLIS 

VSGDSCIFVWRLSSEMTISMRQRLAELRQRQRGG 

KQQGPSSPQRASGPNRHQAPSMLSPGPALSSDSD 

KEGEDEGTEEELPALPVLAKSTKKALASVPSPAL 

PRSLSHWEMSRAQESVGFLDPAPAANPGPRRRG 

RWVQPGVELSVRSMLDLRQLETLAPSLQDPSQD 

SLAIIPSGPRKHGQEALETSLTSQNEKPPRPQASQ 

PCSYPHIIRLLSQEEGVFAQDLEPAPIEDGIVYPEP 

SDNPTMDTSEFQVQAPARGTLGRVYPGSRSSEK 

HSPDSACSVDYSSSCLSSPEHPTEDSESTEPLSVD 

GISSDLEEPAEGDEEEEEEEGGMGPYGLQEGSPQ 

TPDQEQFLKQHFETLASGAAPGAPVQVPERSESR 

SISSRFLLQVQTRPLREPSPSSSSLALMSRPAQVPQ 

ASGEQPRGNGANPPGAPPEVEPSSGNPSPQQAAS 

VLLPRCRLNPDSSWAPKRVATASPFSGLQKAQS 

VHSLVPQERHEASLQAPSPGALLSREIEAQDGLG 

SLPPADGRPSRPHSYQNPTTSSMAKISRSISVGEN 

LGLVAEPQAHAPIRVSPLSKLALPSRAHLVLDIPK 

PLPDRPTLAAFSPVTKGRAPGEAEKPGFPVGLGK 

AHSTTERWACLGEGTTPKPRTECQAHPGPSSPCA 

QQLPVSSLFQGPENLQPPPPEKTPNPMECTKPGA 

ALSQDSEPAVSLEQCEQLVAELRGSVRQAVRLY 

HSVAGCKMPSAEQSRIAQLLRDTFSSVRQELEAV 

AGAVLSSPGSSPGAVGAEQTQALLEQYSELLLRA 

VERRMERKL 


3600 


A 


1688 


916 


IPGSTISCSMALCEAAGCGSALLWPRLLLFGDSIT 

QFSFQQGGWGASLADRLVRKCDVLNRGFSGYN 

TRWAKIELPRLIRKGNSLDIPVAVTIFFGANDSAL 

KDENPKQHIPLEEYAANLKSMVQYLKSVDIPENR 

VILITPTPLCETAWEEQCIIQGCKLNRLNSWGEY 

ANACLQVAQDCGTDVLDLWTLMQDSQDFSSYL 

SDGLHLSPKGNEFLFSHLWPLIEKKVSSLPLLLPY 

WRDVAEAKPELSLLGDGDH 


3601 


A 


44 


223 


VHFPLIPQLAKCFWTMNRAARNKSEKRYYSEFL 

QIAHLFNYGLSSFLREFIIFLIKLLQ 


3602 


A 


37 


1124 


VPKPASGKRRLEFRPQDSKACAATPHSPGRITSR 

TRGSQKVRSVPPRLPWAQASASTDWEGLRGVPG 

PALRRENFLEAAASGRSGRTPTGGVGFRDVGGP 

HFPIFPAAHFLWCNLHTPRRPACNAPWHSPVGEI 

SPPPRESQLRRDPEVHFESPAHPLGFRLLPGRGLP 

ANAVTVETAAMAAPRQIPSHIVRLKPSCSTDSSF 

TRTPVPTVSLASRELPVSSWQVTEPSSKNLWEQI 

CKEYEAEQPPFPEGYKVKQEPVITVAPVEEMLFH 

GFSAEHYFPVSHFTMISRTPCPQDKSETINPKTCS 

PKEYLETFIFPVLLPGMASLLHQAKKEKCFEWL 

QMTPSGGKACVWGHLPSSSHTI 


3603 


A 


286 


587 


NISNKAEVSSHPSVISHSMDSFGQPRPEDNQSVLR 

RMQKKYWKtKQVFIKATGKKEDEHLVASDAEL 

DAKLEVFHSVQETCTELLKIIEKYQLRLNGMKS 


3604 


A 


103 


2440 


QPRRRVFPAAGRGPGRKCSQWGRQASVSFEDVT 

VDFSKEEWQHLDPAQRRLYWDVTLENYSHLLS 
VGYQIPKSEAAFKLEQGEGPWMLEGEAPHQSCS
GEAIGKMQQQGIPGGIFFHCERFDQPIGEDSLCSI 

LEELWQDNDQLEQRQENQNNLLSHVKVLIKERG 

YEHKl^KIfflVTTKLWSIKRLHNODTllJfflTLN 

SHNHNRNSATKNLGKIFGNGNNFPHSPSSTKNEN 

AKTGANSCEHDHYEKHLSHKQAPTHHQKIHPEE 

KLYVCTECVMGFTQKSHLFEHQRIHAGEKSREC 

DKSNKVFPQKPQVDVHPSVYTGBKPYLCTQCGK 

VFTLKSNLITHQJaHTGQKPYKCSECGKAFFQRS 

DLFRHLRIHTGEKPYECSECGKGFSQNSDLSfflQ 

KTHTGEKHYECNECGKAFTOKSALRMHQRIHTG 

EKPYVCADCGKAnQKSHFNTHQRIHTGEKPYEC 

SDCGKSFTKKSQLHVHQRIHTGEKPYICTECGKV 

FTHRTNLTTHQKTHTGEKPYMCAECGKAFTDQS 

NLIKHQKTHTGEKPYKCNGCGKAFIWKSRLKIH 

QKSHIGERHYECKDCGKAnQKSTLSVHQRIHTG 

EKPYVCPECGKAHQKSHFIAHHRIHTGEKPYECS 

DCGKCFTKKSQLRVHQKIHTGEKPNICAECGKAF 

TDRSNLITHQKIHTREKPYECGDCGKTFTWKSRL 

NIHQKSHTGERHYECSKCGKAFIQKATLSMHQII 

HTGKKPYACTECQKAFTDRSNLIKHQKMHSGEK 

RYKASD 


3605 


A 


3 


322 


SFRMSGRGKGGKGLGKGGAKRHRKVLRDNIQGl 
TKPAIRRLARRGGVKRISGLIYEETRGVLKVFLEN 
VIRDAVTYTEHAKRKTVTAMDWYALKRQGRT 

LYGFGG 


3606 


A 


1 


1749 


VPVTAEAKLMGFTQGCVTFEDVAIYFSQEEWGL 

LDEAQRLLYRDVMLENFALITALVCWHGMEDE 

ETPEQSVSVEGVPQVRTPEASPSTQKIQSCDMCV 

PFLTDILHLTDLPGQELYLTGACAVFHQDQKHHS 

AEKPLESDMDKASFVQCCLFHESGMPFTSSEVG 

KDFLAPLGILQPQAIANYEKPNKISKCEEAFHVGI 

SHYKWSQCRRESSHKHTFFHPRVCTGKRLYESS 

KCGKACCCECSLVQLQRVHPGERPYECSECGKS 

FSQTSHLNDHRRIHTGERPYVCGQCGKSFSQRAT 

LIKHHRVHTGERPYECGECGKSFSQSSNLIEHCRI 

HTGERPYECDECGKAFGSKSTLVRHQRTHTGEK 

PYECGECGKLFRQSFSLVVHQRIHTTARPYECGQ 

CGKSFSLKCGLIQHQLIHSGARPFECDECGKSFSQ 

RTTLNKHHKVHTAERPYVCGECGKAFMFKSKL 

VRHQRTHTGERPFECSECGBCFFRQSYTLVEHQKI 

HTGLRPYDCGQCGKSFIQKSSLIQHQWHTGERP 

YECGKCGKSFTQHSGLILHRKSHTVERPRDSSKC 

GKPYSPRSNIV 


3607 


A 


92 


331 


AMAGPGPGPGDPDEQYDFLFKLVLVGDASVGKT 
CWQRFKTGAFSERQGSTIGVDFTMKTLEIQGKR 

VKLQIWDTAGQER 


3608 


A 


545 


379 


AIKGYIHLSAPKNRYMHTTASNGRjMLFMKVTM 
YMRRGVQIMGWSVRMAFMACFTQ 


3609 


A 


115


873 


VWMAWQVSLLELEDRLQCPICLEVFKESLMLQC 

GHSYCKGCLVSLSYHLDTKVRCPMCWQWDGS 

SSLPNVSLAWVIEALRLPGDPEPKVCVHHRNPLS 

LFCEKDQELICGLCGLLGSHQHHPVTPVSTVCSR 

MKEELAALFSELKQEQKKVDELIAKLVKNRTRTV 

NESDVFSWVIRREFQELRHPVDEEKARCLEGIGG 

HTRGLVASLDMQLEQAQGTRERLAQAECVLEQF

GNEDHHEFIWKFHSMASR 


3610 


A 


2 


987 


DPRVRPPLLQPPPPLLPRLVILKMAPLDLDKYVEI 

ARLCKYLPENDLKRLCDYVCDLLLEESNVQPVS 

TPVTVCGDIHGQFYDLCELFRTGGQVPDTNYIFM 

GDFVDRGYYSLETFTYLLALKAKWPDRITLLRG 

NHESRQITQVYGFYDECQTKYGNANAWRYCTK 

VFDMLTVAALIDEQELCVHGGLSPDIKTLDQmTI 

EKNQEIPHKGAFCDLVWSDPEDVDTWAISPRGA 

GWLFGAKVTNEFVHINNLKLICRAHQLVHEGYK 

FMFDEKLVWWSAPNYCYRCGNIASIMVFKDVN 

TREPKLFRAVPDSERVIPPRTTTPYFL 


3611 


A 


2455


869 


AEKMTAELREAMALAPWGPVKVKKEEEEEENF 

PGQASSQQVHSENIKVWAPVQGLQTGLDGSEEE 

EKGQNISWDMAWLKATQEAPAASTLGSYSLPG 

TLAKSEILETHGTMNFLGAETKNLQLLVPKTEIC 

EEAEKPLIISERIQKADPQGPELGEACEKGNMLK 

RQRIKREKKDFRQVrVKDCHLPESFKEEENQKCK 

KSGGKYSLNSGAVKNPKTQLGQKPFTCSVCGKG 

FSQSANLWHQRIHTGEKPFECHECGKAFIQSAN 

LWHQRIHTGQKPYVCSKCGKAFTQSSNLTVHQ 

KIHSIyEKTFKCNECEKAFSYSSQLARHQKVHITE 

KCYECNECGKTFTRSSNLIVHQRIHTGEKPFACN 

DCGKAFTQSANLIVHQRSHTGEKPYECKECGKA 

FSCFSHLrVHQRIHTAEKPYDCSECGKAFSQLSCL 

IVHQRJHSGDLPYVCNECGKAFTCSSYLLIHQRIH 

NGEKPYTCNECGKAFRQRSSLTVHQRTHTGEKP 

YECEKCGAAFISNSHLMRHHRTHLVE 


3612 


A 


318 


2245 


SPMAEAALVNTPQEPMVTEEFVKPSQGHVTFEDI 

AVYFSQEEWGLLDEAQRCLYHDVMLENFSLMA 

SVGCLHGIEAEEAPSEQTLSAQGVSQARTPKLGP 

SIPNAHSCEMCILVMKDILYLSEHQGTLPWQKPY 

TSVASGKWFSFGSNLQQHQNQDSGEKHIRKEESS 

ALLLNSCKIPLSDNLFPCKDVEKDFPTILGLLQHQ 

TTHSRQEYAHRSRETFQQRRYKCEQVFNEKVHV 

TEHQRVHTGEKAYKRREYGKSLNSKYLFVEHQR 

THNAEKPYVCNICGKSFLHKQTLVGHQQRIHTRE 

RSYVCIECGKSLSSKYSLVEHQRTHNGEKPYVCN 

VCGKSFRHKQTFVGHQQRIHTGERPYVCMECGK 

SFIHSYDRIRHQRVHTGEGAYQCSECGKSFIYKQ 

SLLDHHRIHTGERPYECKECGKAFIHKKRLLEHQ 

RIHTGEKPYVCIICGKSFIRSSDYMRHQRIHTGER 

AYECSDCGKAFISKQTLLKHHKIHTRERPYECSE 

CGKGFYLEVKLLQHQRIHTREQLCECNECGKVF 

SHQKRLLEHQKVHTGEKPCECSECGKCFRHRTS 

LIQHQKVHSGERPYNCTACEKAFIYKNKLVEHQ 

RIHTGEKPYECGKCGKAFNKRYSLVRHQKVHIT 

EEP 


3613 


A 


817 


3345 


NQSHPDSETVTVEGGRRKMKSNQERSNECLPPK 

KREIPATSRSSEEKAPTLPSDNHRVEGTAWLPGN 

PGGRGHGGGRHGPAGTSVELGLQQGIGLHKALS 

TGLDYSPPSAPRSVPVATTLPAAYATPQPGTPVSP 

VQYAHLPHTFQFIGSSQYSGTYASFIPSQLIPPTAN 

PVTSAVASAAGATTPSQRSQLEAYSTLLANMGS 

lsqtpghkaeqqqqqqqqqqqqqqqqqqqqq 

QQQHQQQQQQQQQQQQQQHLSRAPGLITPGSPP 

PAQQNQYVfflSSSPQNTGRTASPPAIPVHLHPHQ 

TMIPHTLTLGPPSQVVMQYADSGSHFVPREATK 

KAESSRLQQAIQAKEVLNGEMEKSRRYGAPSSA 

DLGLGKAGGKSVPHPYESRHWVHPSPSDYSSR 

DPSGVRASVMVLPNSNTPAADLEVQQATHREAS 

PSTLNDKSGLHLGKPGHRSYALSPHTVIQTTHSA 

SEPLPVGLPATAFYAGTQPPVIGYLSGQQQAITY 

AGSLPQHLVIPGTQPLLIPVGSTDMEASGAAPAIV 

TSSPQFAAVPHTFVTTALPKSENFNPEALVTQAA 

YPAMVQAQIHLPWQSVASPAAAPPTLPPYFMK 

GSHQLANGELKKVEDLKTEDFIQSAEISNDLKroS 

STVERIEDSHSPGVAVIQFAVGEHRAQVSVEVLV 

EYPFFVFGOGWSSCCPERTSOLFDLPCSKLSVGD 

VCISLTLKNLKNGSVKKGQPVDPASVLLKHSKA 

DGLAGSRHRYAEQENGINQGSAQMLSENGELKF

PEKMGLSAAPFLTKIEPSKPAATRKRRWSAPESR 

KLEKSEDEPPLTLPKPSLIPQEVKICIEGRSNVGK 


3614 


A 


3 


114 


FFESRLRCKCCEPRGSWARFGCWRLQPEFKPKQ 
LEG 


3615 


A 


3 


1603 


DAWALTNQFSDSKQHIEVLKESLTAKEQRAAILQ 

TEVDALRLRLEEKETMLNKKTKQIQDMAEEKGT 

QAGEIHDLKDMLDVKERKVNVLQKKIENLQEQL 

RDKEKQMSSLKERVKSLQADTTNTDTALTTLEE 

ALAEKERTIERLKEQRDRDEREKQEEIDNYKKDL 

KDLKEKVSLLQGDLSEKEASLLDLKEHASSLASS 

GLKKDSRLKTLEIALEQKKEECLKMESQLKKAH 

EAALEARASPEMSDRIQHLEREITRYKDESSKAQ 

AEVDRLLEILKEVENEKNDKDKKIAELESLTSRQ 

VKDQNKKVANLKHKEQVEKKKSAQMLEEARRR 

EDNLNDSSQQLQDSLRKKDDRIEELEEALRESVQ 

ITAEREMVLAOEESARTNAEKOVEELLMAMEKV 

KQELESMKAKLSSTQQSLAEKETHLTNLRAERR 

KHLEEVLEMKQEALLAAISEKDANIALLELSSSK 

KKTQEEVAALKREKDRLVQQLKQQTQNRMKLM 

ADNYEDDHFKSSHSNQTNHKPSPDQDEEEGIWA


3616 


A 


244 


1420 


rrrwrargglvptlawaeatgayvpgrdkpdl 

ptwkrnfrsalnrkeglrlaedrskdphdphki 

yefvnsgvgdfsqpdtspdtngggstsdtqedil 

dellgnmvlaplpdpgppslavapepcpqplrsps 

ldnptpfpnlgpse>a>lkrllvpgeewefevtaf 

Yrgrqvfqqtiscpeglrlvgsevgdrtlpgwp 

vtlpdpgmsltdrgvmsyvrhvlsclggglal 

wragowlwaorlghchtywavseellpnsgh 

gpdgevpkdkeggvfdlgpfivgslgppdlitfte 

gsgrspryalwfcvgeswpqdqpwtkrlvmvk 

WPTCLRALVEMARVGGASSLENTVDLfflSNSHP 

lsltsdqykaylqdlvegmdfqgpges 


3617 


A 


852 


304 


rggllskmarvlkaaaanavglfsrlqapiptv 
rasstsopldovtgsvwnlgrlnhvaiavpdle 
kaaafyknilgaqvseavplpehgvsvvfvnlg 
ntkmellhplgrdspiagflqknkaggmhhicie 
vdninaavmdlkkkkirslseevkigahgkpvif 
lhpkdcggvlveleqa 


3618 


A 


3 


5992 


DNIDETYGVNVQFESDEEEGDEDVYGEVREEAS 
DDDMEGDEAVVRCTLSANMYVDEILVWCASEL 

NIPEFFPLESPHKKVGYGLSSRTWLQGGGKVIEA

GRDLLVASGELMSSKKKDLHPRDIDAFWLQRQL 

SRFYDDAIVSQKKADEVLEILKTASDDRECENQL 

VLLLGFNTFDFIKVLRQHRMMILYCTLLASAQSE 

AEKERIMGKMEADPELSKFLYQLHETEKEDLIRE 

ERSRRERVRQSRMDTDLETMDLDQGGEALAPRQ 

VLDLEDLVFTQGSHFMANKRCQLPDGSFRRQRK 

GYEEVHVPALKPKPFGSEEQLLPVEKLPKYAQA 

GFEGFKTLNRIQSKLYRAALETDENLLLCAPTGA 

GKTKVALMCMLREIGKHINMDGTINVDDFKUYI 

APMRSLVQEMVGSFGKRLATYGITVAELTGDHQ 

LCKEEISATQirVCTPEKWDIITRKGGERTYTQLV 

RLIILDEIHLLHDDRGPVLEALVARAIRNIEMTQE 

DVRLIGLSATLPNYEDVATFLRVDPAKGLFYFDN 

SFRPVPLEQTYVGITEKKAIKRFQIMNEIVYEKIM 

EHAGKNQVLVFVHSRKETGKTARAIRDMCLEBCD 

TLGLFLREGSASTEVLRTEAEQCKNLELKDLLPY 

GFAIHHAGMTRVDRTLVEDLFGDKHIQVLVSTA 

TLAWGVNLPAHTVnKGTQVYSPEKGRWTELGA 

LDILQMLGRAGRPQYDTKGEGILITSHGELQYYL 

SLLNQQLPIESQMVSKLPDMLNAEIVLGNVQNA 

KDAVNWLGYAYLYIRMLRSPTLYGISHDDLKGD 

PLLDQRRLDLVHTAALMLDKNNLVKYDKKTGN 

FQVTELGRIASHYYITNDTVQTYNQLLKPTLSEIE 

LFRVFSLSSEFKNITVREEEKLELQKLLERVPIPVK 

ESEEEPSAKINVLLQAFISQLKLEGFALMADMVY 

VTQSAGRLMRAIFEIVLNRGWAQLTDKTLNLCK 

MIDKRMWQSMCPLRQFRKLPEEVVKKIEKKNFP 

FERLYDLNHNEIGELIRMPKMGKTIHKYVHLFPK 

LELSVHLQPITRSTLKVELTITPDFQWDEKVHGSS 

EAFWILVEDVDSEVILHHEYFLLKAKYAQDEHLI 

TFFVPVFEPLPPQYFIRWSDRWLSCETQLPVSFR 

HLBLPEKYPPPTELLDLQPLPVSALRNSAFESLYQ 

DKFPFFNPIQTQVFNTVYNSDDNVFVGAPTGSGK 

TICAEFAELRMLLQNSEGRCVYITPMRLWQEQVY 

MDWYEICFQDRLNKKVVLLTGETSTDLKLLGKG 

NinSTPEKWDILSRRWKQRKNVQNINLFWDEV 

HLIGGENGPVLEVICSRMRYISSQIERPIRIVALSSS 

LSNAKDVAHWLGCSATSTFNFHPNVRPVPLELHI 

QGFNISHTQTRLLSMAKPVFHAITKHSPKKPVIVF 

VPSRKQTRLTAIDILTTCAADIQRQRFLHCTEKDL 

PYLEKLSDSTLKETLLNGVGYLHEGLSPMERRL 

VEQLFSSGAIQVVVASRSLCWGMNVAAHLVIIM 

DTLYYNGKIHAYVDYPIYDVLQMVGHANRPLQ 

DDEGRCVIMCQGSKKDFFKKFLYEPLPVESHLD 

HCMHDHFNAEIVTKTIENKQDAVDYLTWTFLYR 

RMTQNPNYYNLQGISHRHLSDHLSELVEQTLSDL 

EQSKCISIEDEMDVAPLNLGMIAAYYYINYTTIEL 

FSMSLNAKTKVRGLIEnSNAAEYENlPIRHHEDN 

LLRQLAQKVPHKLNNPKFNDPHVKTNLLLQAHL 

SRMQLSAELQSDTEEILSKAIRLIQACVDVLSSNG 

WLSPALAAMELAQMVTQAMWSEDSYLRRLPPF 

PSGLFKRCTDKGVESVFDIMEMEDEERNALLQLT 

DSQIADVARFCNRYPNIELSYEVVDKDSIRSGGP 

VVVLVQLEREEEVTGPVIAPLFPQKREEGWWW 

IGDAKSNSLISKRLTLQQKAKVKLDFVAPATGG 
RHNTLYFMSDAYMGCDQEYKFSVDVKEAETDS 
DSD 


3619 


A 


3 


5992 


DNIDETYGVNVQFESDEEEGDEDVYGEVREEAS 

DDDMEGDEAWRCTLSANMYVDEILVWCASEL 

NIPEFFPLESPHKKVGYGLSSRTWLQGGGKVIEA 

GRDLLVASGELMSSKKKDLHPRDIDAFWLQRQL 

SRFYDDAIVSQKKADEVLEILKTASDDRECENQL 

VLLLGFNTFDFIKVLRQHRMMILYCTLLASAQSE 

AEKERIMGKMEADPELSKFLYQLHETEKEDLIRE 

ERSKRERVRQSRMDTDLETMDLDQGGEALAPRQ 

VLDLEDLVFTQGSHFMANKRCQLPDGSFRRQRK 

GYEEVHVPALKPKPFGSEEQLLPVEKLPKYAQA 

GFEGFKTLNRIQSKLYRAALETDENLLLCAPTGA 

GKThrVALMCNILREIGKHINMDGTINVDDFKIIYI 

APMRSLVQEMVGSFGKRLATYGITVAELTGDHQ 

LCKEEISATQirVCTPEKWDiriRKGGERTYTQLV 

RLIILDEIHLLHDDRGPVLEALVARAIRNIEMTQE 

DVRLIGLSATLPNYEDVATFLRVDPAKGLFYFDN 

SFRPVPLEQTYVGITEKKAIKRFQIMNEIVYEKIM 

EHAGKNQVLVF\aiSRKETGKTARAIRDMCLEKD 

TLGLFLREGSASTEVLRTEAEQCKNLELKDLLPY 

GFAIHHAGMTRVDRTLVEDLFGDKHIQVLVSTA 

TLAWGVNLPAHTVIIKGTQVYSPEKGRWTELGA 

LDILQMLGRAGRPQYDTKGEGILITSHGELQYYL 

SLLNQQLPIESQMVSKLPDMLNAEIVLGNVQNA 

KDAVNWLGYAYLYIRMLRSPTLYGISHDDLKGD 

PLLDQRRLDLVHTAALMLDKNNLVKYDKKTGN 

fqvtelgriashyyitndtvqtynqllkptlseie 

lfrvfslssefknitvreeeklelqkllervpipvk 

esieepsakinvllqafisqlklegfalmadMvy 

vtqsagrlmraifeivlnrgwaqltdktlnlck 

midkrmwqsmcplrqfrklpeevvkkiekknfp 

ferlydlnhneigelbrmpkmgktihkyvhlfpk 

lelsvhlqpitrstlkveltitpdfqwdekvhgss 

eafwilvedvdsevilhheyfllkakyaqdehli 

tffvpvfeplppqyfirwsdrwlscetqlpvsfr 

hln.pekyppptelldlqplpvsalrnsafeslyq 

dkfpffnpiqtqvfntvynsddnvfvgaptgsgk 

■nCAEFAILRMLLQNSEGRCVYrrPMRLWQEQVY 

MDWYEKFQDRLNKKVVLLTGETSTDLKLLGKG 

NIIISTPEKWDILSRRWKQRKNVQNINLFVVDEV 

HLIGGENGPVLEVICSRMRYISSQffiRPIRIVALSSS 

LSNAKDVAHWLGCSATSTFNFHPNVRPVPLELHI 

QGFMSHTQTRLLSMAKPVFHAITKHSPKKPVIVF 

VPSRKQTRLTAIDILTTCAADIQRQRFLHCTEKDL 

ffYLEKLSDSTLKETLLNGVGYLHEGLSPMERRL 

VEQLFSSGAIQVWASRSLCWGMNVAAHLVIIM 

DTLYYNGKIHAYVDYPIYDVLQMVGHANRPLQ 

DDEGRCVIMCQGSKKDFFKKFLYEPLPVESHLD 

HCMHDHFNAEIVTKTIENKQDAVDYLTWTFLYR 

RMTQNPNYYNLQGISHRHLSDHLSELVEQTLSDL 

EQSKCISIEDEMDVAPLNLGMIAAYYYINYTTIEL 

FSMSLNAKTKVRGLIEnSNAAEYENIPIRHHEDN 

LLRQLAQKVPHKLNNPKFNDPHVKTNLLLQAHL 

SRMQLSAELQSDTEEILSKAIRLIQACVDVLSSNG 

WLSPALAAMELAQMVTQAMWSEDSYLRRLPPF 

PSGLFKRCTDKGVESVFDIMEMEDEERNALLQLT 

DSQIADVARFCNRYPNIELSYEWDKDSIRSGGP 

VWLVQLEREEEVTGPVIAPLFPQKREEGWWVV 

IGDAKSNSLISKRLTLQQKAKVKLDFVAPATGG 

RHNILYFMSDAYMGCDQEYKFSVDVKEAETDS 

DSD 


3620 


A 


1205 


323 


VIKMALAARLLPQFLHSRSLPCGAVRLRTPAVAE 

VRLPSATLCYFCRCRLGLGAALFPRSARALAASA 

LPAQGSRWPVLSSPGLPAAFASFPACPQRSYSTE 

EKPQQHQKTKMIVLGFSNPINWVRTRIKAFLIWA 

YFDKEFSITEFSEGAKQAFAHVSKLLSQCKFDLL 

EELVAKEVLHALKEKVTSLPDNHKNALAANBDEI 

VFTSTGDISIYYDEKGRKFVNILMCFWYLTSANIP 

SETLRGASVFQVKLGNQNVETKQLLSASYEFQR 

EFTQGVKPDWTIARIEHSKLLE 


3621 


A 


2 


2995 


SSSRSRHSSISPVRLPLNSSLGAELSRKKKERAAA 

AAAAKMDGKESSYERSGSYSGRSPSPYGRRRSSS 

PFLSKRSLSRSPLPSRKSMKSRSRSPAYSRHSSSH 

SKKKRSSSRSRHSSISPVRLPLNSSLGAELSRKKK 

ERAAAAAAAKMDGKESSYERSGSYSGRSPSPYG 

RRRSSSPFLSKRSLSRSPLPSRKSMKSRSRSPAYS 

RHSSSHSKKKRSSSRSRHSSISPVRLPLNSSLGAEL 

SRKKKERAAAAAAAKMDGKESKGSPVFLPRKE 

NSSVEAKDSGLESKKLPRSVKLEKSAPDTELVNV 

THLNTEVKNSSDTGKVKLDENSEKHLVKDLKAQ 

GTRDSKPIALKEEIVTPKETETSEKETPPPLPTIASP 

PPPLPTTTPPPQTPPLPPLPPIPALPQQPPLPPSQPA 

FSQVPASSTSTLPPSTHSKTSAVSSQANSQPPVQV 

SVKTQVSVTAAIPHLKTSTLPPLPLPPLLPGDDDM 

DSPKETLPSKPVKKEKEQRTRHLLTDLPLPPELPG 

GDLSPPDSPEPKAITPPQQPYKKRPKICCPRYGER 

RQTESDWGKRCVDKFDIIGIIGEGTYGQVYKAKD 

KDTGELVALKKVRLDNEKEGFPITAIREIKILRQL 

IHRSVVNMKEIVTDKQDALDFKKDKGAFYLVFE 

YMDHDLMGLLESGLVHFSEDHIKSFMKQLMEGL 

EYCHKKNFLHRDIKCSNILLNNSGQIKLADFGLA 

RLYNSEESRPYTNKVITLWYRPPKLLLGEERYTP 

AIDVWSCGCILGELFTKKPIFQANLELAQLELISR 

LCGSPCPAVWPDVIKLPYFNTMKPKKQYRRRLR 

EEFSFIPSAALDLLDHMLTLDPSKRCTAEQTLQSD 

FLKDVELSKMAPPDLPHWQDCHELWSKKRRRQ 

RQSGVVVEEPPPSKTSRKETTSGTSTEPVKNSSPA 

PPQPAPGKVESGAGDAIGLADITQQLNQSELAVL 

LNLLQSQTDLSIPQMAQLLNIHSNPEMQQQLEAL 

NQSISALTEATSQQQDSETMAPEESLKEAPSAPVI

LPSAEQTTLEASSTPADMQNILAVLLSQLMKTQE 

PAGSLEENNSDKNSGPQGPRRTPTMPQEEAAGRS 

NGGNAL 


3622 


A 


16 


390 


TPERGSAYPETAAVRRPAGECPITMSDLEAKLST 
EHLGDKIKDEDIKLRVIGQDSSEIHFKVKMTTPLK 
KLKKSYCQRQGVPVNSLRFLFEGQRIADNHTPEE 
LGMEEEDVIEVYQEQIGGHSTV 


3623 


A 


2 


1544 


PPPAPGPDGLNEGCLHRLSMPHQRPRTCAMhJPE 

LTMESLGTLHGARGGGSGGGGGGGGGGGGGGP 

GHEQELLASPSPHHARRGPRGSLRGPPPPPTAHQ 

ELGTAAAAAAAASRSAMVTSMASILDGGDYRPE 

LSIPLHHAMSMSCDSSPPGMGMSNTYTTLTPLQP 

LPPISTVSDKFHHPHPHHHPHHHHHHHHQRLSGN 

VSGSFTLMRDERGLPAlVnvJNLySPYKEMPGMSQS 

LSPLAATPLGNGLGGLHNAQQSLPNYGPPGHDK 

MLSPNFDAHHTAMLTRGEQHLSRGLGTPPAAM 

MSHLNGLHHPGHTQSHGPVLAPSRERPPSSSSGS 

QVATSGQLEEINTKEVAQRITAELKRYSIPQAIFA 

QRVLCRSQGTLSDLLRNPKPWSKLKSGRETFRR 

MWWLQEPEFQRMSALRLAACKRKEQEPNKDR 

NNSQKKSRLVFTDLQRRTLFAIFKENKRPSKEMQ 

ITISQQLGLELTTVSNFFMNARRRSLEKWQDDLS 

TGGSSSTSSTCTKA 


3624 


A 


27 


2152 


SARKAEAATSGTAARDGSVGRNLVPPPSASAPK 

AEVESNEKDNRPEEEEQVIHEDDERPSEKNEFSR 

RKRSKSEDMDNVQSKRRRYMEEEYEAEFQVKIT 

AKGDINQKLQKVIQWLLEEKLCALQCAVFDKTL 

AELKTRVEKJECNKRHKTVLTELQAKIARLTKiRF 

EAAKEDLKKJmEHPPNPPVSPGKTVNDVNSNNN 

MSYRNAGTVRQMLESKRNVSESAPPSFQTPVNT 

VSSTNLVTPPAVVSSQPKLQTPVTSGSLTATSVLP 

APNTATVVATTQVPSGNPQPTISLQPLPVILHVPV 

AVSSQPQLLQSHPGTLVTNQPSGNVEFISVQSPPT 

VSGLTKNPVSLPSLPNPTKPNNVPSVPSPSIQRNP 

TASAAPLGTTLAVQAVPTAHSIVQATRTSLPTVG 

PSGLYSPSTNRGPIQMKIPISAFSTSSAAEQNSNTT 

PRIENQTNKTIDASVSKKAADSTSQCGKATGSDS 

SGVIDLTMDDEESGASQDPKKLNHTPVSTMSSSQ 

PVSRPLQPIQPAPPLQPSGVPTSGPSQTTIHLLPTA

PTTVNVTHRPVTQVTTRLPVPRAPANHQVVYTT 

LPAPPAQAPLRGTVMQAPAVRQVNPQNSVTVRV 

PQTTTYVVNNGLTLGSTGPQLTVHHRPPQVHTEP 

PRPVHPAPLPEAPQPQRLPPEAGSTSRPSEATLEV

SHAFRVKMAIVLVMECPGGGSKLCHC 


3625 


A 


210 


1115 


ASPFLRPQGHDSGEREPFSQTPGLMQPFSIPVQIT 

LQGSRRRQGRTAFPASGKKRETDYSDGDPLDVH 

KRLPSSTGEDRAVMLGFAMMGFSVLMFFLLGTT 

ILKPFMLSIQREESTCTAIHTDIMDDWLDCAFTCG 

VHCHGQGKYPCLQVFVNLSHPGQKALLHYNEE 

AVQINPKCFYTPKCHQDRNDLLNSALDIKEFFDH 

KNGTPFSCFYSPASQSEDVILIKKYDQMAIFHCLF 

WPSLTLLGGALIVGMVRLTQHLSLLCEKYSTVV 

RDEVGGKVPYIEQHQFKLCIMRRSKGRAEKS 


3626 


A 


9 


921 


SSVVEFSALSVSMACLSPSQLQKFQQDGFLVLEG 

FLSAEECVAMQQRIGEIVAEMDVPLHCRTEFSTQ 

EEEQLRAQGSTDYFLSSGDKERFFFEKGVFDEKG 

NFLVPPEKSINKIGHALHAHDPVFKSITHSFKVQT 

LARSLGLQMPVVVQSMYIFKQPHFGGEVSPHQD 

ASFLYTEPLGRVLGVWIAVEDATLENGCLWFIPG 

SHTSGVSRRMVRAPVGSAPGTSFLGSEPARDNSL 

FVPTPVQRGALVLIHGEVVHKSKQNLSDRSRQA 

YTFHLMEASGTTWSPENWLQPTAELPFPQLYT 


3627 


A 


231 


644 


INSSPRTGRDHQELNLHTERDSRSQRAVLKIPRQ 

NPGIFYWIFLPSRSHSASHGSRQRQVSCQGTQDEI 
LKMRNTFAELKNSLEALSSRMDQAEERIGTQAG 
VQWRDHGSLQPQPPEFKQCFHLSLPSSWDYRAC
LS 


3628 


A 


2 


810


GCKHLLQNSWYDPRVREADRVGQRARRPRAAM 

DWLMGKSKAKPNGKKPAAEERKAYLEPEmXA 

RITDFQFKELVVLPREmLNEWLASNTTTFFHHIN 

LQYSTISEFCTGETCQTMAVCNTQYYWYDERGK 

KVKCTAPQYVDFVMSSVQKLVTDEDVFPTKYG 

REFPSSFESLVRKICRHLFHVLAHIYWAHFKETLA 

LELHGHLNTLYVHFILFAREFNLLDPKETAIMDD 

LTEVLCSGGRRGSTVGAVGMGPAAGAPGAQNH 

VKER 


3629 


A 


699 


1604 


CSHGSSAVSAWSPLFQASEVERQLSMQVHALRE 

DFREKNSSTNQHIIRLESLQAEIKMLSDRKRELEH 

RLSATLEENDLLQGTVEELQDRVLILERQGHDKD 

LQLHQSQLELQEVRLSCRQLQVKVEELTEERSLQ 

SSAATSTSLLSEIEQSMEAEELEQEREQLTLLSVE 

MTALKEERDRLRVTSEDKEPKEQLQKAIRDRDE 

AIAKKNAVELELAKCRMDMMSLNSQLLDAIQQ 

KLNLSQQLEAWQDDMHRVIDRQLMDTHLKERS 

QPAAALCRGHSAGRGDEPSIAEGKRLFSFFRKI 


3630 


A 


423 


1 


PAKVLTLDIYLSKTEGAQVDEPVVITPRAEDCGD 

WDDMEKRSSGRRSGRRRGSQKSTDSPGADAELP 

ESAARDDAVFDDEVAPNAASDNASAEKKVKSPR 

AALDGGVASAASPESKPSPGTKGQLRGESDRSK 

QPPPASSP 


3631 


A 


2082 


674 


WSGFWQLPGVRGVGSAPGGDGAEFTSRRGSSRR 

PGAACPGCRGAGSERAPGGMGRRRAPELYRAPF 

PLYALQVDPSTGLLIAAGGGGAAKTGIKNGVHF 

LQLELINGRLSASLLHSHDTETRATMNLALAGDI 

LAAGQDAHCQLLRFQAHQQQGNKAEKAGSKEQ 

GPRQRXGAAPAEKKCGAETQHEGLELRVENLQA 

VQTDFSSDPLQKVVCFNHDNTLLATGGTDGYVR 

VWKVPSLEKVLEFKAHEGEIEDLALGPDGKLVT 

VGRDLKASVWQKDQLVTQLHWQENGPTFSSTP 

YRYQACRFGQVPDQPAGLRLFTVQIPHKRLRQPP 

PCYLTAWDGSNFLPLRTKSCGHEVVSCLDVSES 

GTFLGLGTVTGSVAIYIAFSLQCLYYVREAHGIV 

VTDVAFLPEKGRGPELLGSHETALFSVAVDSRCQ 

LHLLPSRRSVPVWLLLLLCVGLIIVnLLLQSAFPG 

FL 


3632 


A 


942 


40 


PWCQRVEVRSCGSSKRSCSRWSGSSWDGSRSLG 

RGLNHTSLNRSPPFTPDTMTHCCSPCCQPTCCRT 

TCCRTTCWKPTTVTTCSSTPCCQPSCCVPSCCQP 

CCHPTCCQNTCCRTTCCQPTCVASCCQPSCCSTP 

CCQPTCCGSSCCGQTSCGSSCCQPICGSSCCQPCC 

HPTCYQTICFRTTCCQPTCCQPTCCRNTSCQPTCC 

GSSCCQPCCHPTCCQTICRSTCCQPSCVTRCCSTP 

CCQPTCGGSSCCSQTCNESSYCLPCCRPTCCQTT 

CYRTTCCRPSCCCSPCCVSSCCQPSCC 


3633 


A 


605 


3004 


GPEGYRGRRARHPSLGSTTGHCGGGRGAEGTGT 
DPAAPAARLNVDGLLVYFPYDYIYPEQFSYMRE 
LKRTLDAKGHGVLEMPSGTGKTVSLLALIMAYQ 
RAYPLEVTKLIYCSRTVPEIEKVIEELRKLLNFYE 

KQEGEKLPFLGLALSSRKNLCIHPEVTPLRFGKD 

VDGKCHSLTASYVRAQYQHDTSLPHCRFYEEFD 

AHGREVPLPAGIYNLDDLKALGRRQGWCPYFLA 

RYSILHANWVYSYHYLLDPKIADLVSKELARK 

AWVFDEAHNEDNVCEDSMSVNLTRRTLDRCQG 

NLETLQKTVLRIKETDEQRLRDEYRRLVEGLREA 

SAARETDAHLANPVLPDEVLQEAVPGSIRTAEHF 

LGFLRRLLEYVKWRLRVQHVVQESPPAFLSGLA 

QRVCIQRKPLRFCAERLRSLLHTLEITDLADFSPL 

TLLANFATLVSTYAKGFTIIIEPFDDRTPTIANPIL 

HFSCMDASLAIKPVFERFQSVnXSGTLSPLDIYPK 

ILDFHPVTMATFTMTLARVCLCPMIIGRGNDQVA 

ISSKFETREDIAVIRNYGNLLLEMSAVVPDGIVAF 

FTSYQYMESTVASWYEQGILENIQRNKLLFIETQ 

DGAETSVALEKYQEACENGRGAILLSVARGKVS 

EGIDFVHHYGRAVIMFGVPYVYTQSRILKARLEY 

LRDQFQIRENDFLTFDAMRHAAQCVGRAIRGKT 

DYGLMVFADKRFARGDKRGKLPRWIQEHLTDA 

NLNLTVDEGVQVAKYFLRQMAQPFHREDQLGL 

SLLSLEQLESEETLKRIEQIAQQL 


3634


A 


159 


384 


LKMSSKTASTNNIAQARRTVQQLRLEASIERIKV 
SKASADLMSYCEEHARSDPLLIGIPTSENPFKDKK 
TCIIL


3635 


A 


5 


409 


TELSQLEKAHPPADMGRRKSKRKPPPKKKMTGT 

LETQFTCPFCNHEKSCDVKMDRARNTGVISCTV 
CLEEFQTPITCILGNLGFFQRVGRGLESGPCSSGP 
LCALVQGQSRPEEQVPPSDFCGVRRCRAGFQCQ 


3636 


A 


48 


282 


DHLKSCYQDSHEDPTKMKRFLFLLLTISLLVMVQ 
IQTGLSGQNDTSQTSSPSASSSMSGGIFLFFVANAI 
IHLFCFS 


3637 


A 


1 


1248 


ARAGSVVGSAAARGPPAGCRCERAARLPSSPAR 

RRRCDWVEDGAGRMEILMTVSKFASICTMGAN 

ASALEKEIGPEQFPVNEHYFGLVNFGNTCYCNSV 

LQALYFCRPFREKGLAYKSQPRKKESLLTCLADL 

FHSIATQKKKVGVIPPKKFITRLRKENELFDNYM 

QQDAHEFLNYLLNTIADILQEERKQEKQNGRLPN 

GNIDNENNNSTPDPTWVHEIFQGTLTNETRCLTC 

ETISSKDEDFLDLSVDVEQNTSiniCLRGFSNTET 

LCSEYKYYCEECRSKQEAHKRMKVKKLPMILAL 

HLKRFKYMDQLHRYTKLSYRWFPLELRLFNTS 

GDATNPDRMYDLVAVVVHCGSGPNRGHYIAIV 

KSHDFWLLFDDDIVEKIDAQAIEEFYGLTSDISKN 

SESGYILFYQSRD 


3638 


A 


11 


630 


PAGIPVSTISSDRRASTDLTRKMKPDETPMFDPNL 

LKEVDWSQNTATFSPAISPTHPGEGLVLRPLCTA 

DLNRGFFKVLGQLTETGVVSPEQFMKSFEHMBCK 

SGDYYVTWEDVTLGQIVATATLIIEHKFIHSCAK 

RGRVEDWVSDECRGKQLGNLLLSTLTLLSKKL 

NCYKITLECLPQNVGFYKKFGYTVSEENYMCRR 

FLK 


3639 


A 


2 


1200 


PRVRLLRPSRSRSCRGLLSTRAPGPSPFRSLHSSPL 

LPHAMKSPFYRCQNTTSVEKGNSAVMGGVLFST 

GLLGNLLALGLLARSGLGWCSRRPLRPLPSVFY 

MLVCGLTVTDLLGKCLLSPWLAAYAQNRSLRV 

LAPALDNSLCQAFAFFMSFFGLSSTLQLLAMALE 

CWLSLGHPFFYRRHITLRLGALVAPWSAFSLAF 

CALPFMGFGKFVQYCPGTWCFIQMVHEEGSLSV 

LGYSVLYSSLMALLVLATVLCNLGAMRNLYAM 

HRRLQRHPRSCTRDCAEPRADGREASPQPLEELD 

HLLLLALMTVLFTMCSLPVIYRAYYGAFKDVKE 

KNRTSEEAEDLRALRFLSVISIVDPWIFIIFRSPVFR 

IFFHKIFIRPLRYRSRCSNSTNMESSL 


3640 


A 


930 


182 


PLPPPTLAMFLTRSEYDRGVNTFSPEGRLFQVEY 

AIEAIKLGSTAIGIQTSEGVCLAVEKRITSPLMEPS 

SIEKIVEIDAHIGCAMSGLIADAKTLIDKARVETQ 

NHWFTYNETMTVESVTQAVSNLALQFGEEDADP 

GAMSRPFGVALLFGGVDEKGPQLFHMDPSGTFV 

QCDARAIGSASEGAQSSLQEVYHKSMTLKEAIKS 

SLIILKQVMEEKLNATNffiLATVQPGQNFHMFTK 

EELEEVIKDI 


3641 


A 


2 


1254 


PTGQGGRRAEARSCLLSKAMLGRSGYRALPLGD 

FDRFQQSSFGFLGSQKGCLSPERGGVGTGADVPQ 

SWPSCLCHGLISFLGFLLLLVTFPISGWFALKIVPT 

YERMIVFRLGRIRTPQGPGMVLLLPFIDSFQRVDL 

RTRAFNVPPCKLASKDGAVLSVGADVQFRIWDP 

VLSVMTVKDLNTATRMTAQNAMTKALLKRPLR 

EIQMEKLKISDQLLLEINDVTRAWGLEVDRVELA 

VEAVLQPPQDSPAGPNLDSTLQQLALHFLGGSM 

NSMAGGAPSPGPADTVEMVSEVEPPAPQVGARS 

SPKQPLAEGLLTALQPFLSEALVSQVGACYQFNV 

VLPSGTQSAYFLDLTTGRGRVGHGVPDGIPDVV 

VEMAEADLRALLCRELRPLGAYMSGRLKVKGD 

LAMAMKLEAVLRALK 


3642 


A 


1 


237 


RRGEIDMATEGDVELELETETSGPERPPEKPRKH 

DSGAADLERVTDYAEEKEIQSSNLETAMSVIGDR 

RSREQKAKQER 


3643 


A 


94 


541 


RKERRRRRRRMEAWFVFSLLDCCALIFLSVYFII 

TLSDLECDYINARSCCSKLNKWVPELIGHTIVTV 

LLLMSLHWFIFLLNLPVATWNIYRYIMVPSGNM 

GVFDPTEIHNRGQLKSHMKEAMIKLGFHLLCFF 

MYLYSMILALIND 


3644 


A 


95 


2808 


TSCRHFPITSEDPLNYLLILTVERIYAYQALPLGFL 

FCSRDPVPEYLNHCGVKYVLISDRASFCALHIFFS 

PFRNVFRPAAGGGIAPPPRLWFQPSLSDAEMEIPK 

LLPARGTLQGGGGGGIPAGGGRVHRGPDSPAGQ 

VPTRRLLLPRGPQDGGPGRRREEASTASRGPGPS 

LFAPRPHQPSGGGGGGGDDFFLVLLDPVGGDVE 

TAGSGQAAGPVLREEAEEGPGLQGGESGANPAG 

PTALGPRCLSAVPTPAPISAPGPAAAFAGTVTIHN 

QDLLLRFENGVLTLATPPPHAWEPGAAPAQQPG 

CLIAPQAGFPHAAHPGDCPELPPDLLLAEPAEPAP 

APAPEEEAEGPAAALGPRGPLGSGPGVVLYLCPE 

ALCGQTFAKKHQLKMHLLTHSSSQGQRPFKCPL 

GGCGWTFTTSYKLKRHLQSHDKLRPFGCPAEGC 

GKSFTTVYNLKAHMKGHEQENSFKCEVCEESFP 

TQAKLGAHQRSHFEPERPYQCAFSGCKKTFITVS 

ALFSHNRAHFREQELFSCSFPGCSKQYDKACRLK 

IHLRSHTGERPFLCDFDGCGWNFTSMSKLLRHKR 

KHDDDRRFMCPVEGCGKSFTRAEHLKGHSITHL 

STKPFVCPVAGCCARFSARSSLYIHSKKHLQDVD 

TWKSRCPISSCNKLFTSKHSMKTHMVKRHKVGQ 

DLLAQLEAANSLTPSSELTSQRQNDLSDAEIVSLF 

SDVPDSTSAALLDTALVNSGILTIDVASVSSTLAG 

HLPANNNNSVGQAVDPPSLMATSDPPQSLDTSLF 

FGTAATGFQQSSLNMDEVSSVSVGPLGSLDSLA 

MKNSSPEPQALTPSSKLTVDTDTLTPSSTLCENSV 

SELLTPAKAEWSVHPNSDFFGQEGETQFGFPNAA 

GNHGSQKERInTLITVTGSSFLV 


3645 


A 


2194 


1707 


TVSFHKTMASLKCSTVVCVICLEKPKYRCPACRV 

PYCSVVCFRKHKEQCNPETRPVEKKIRSALPTKT 

VKPVENKDDDDSIADFLNSDEEEDRVSLQNLKN 

LGESATLRSLLLNPHLRQLMVNLDQGEDKAKLM 

RAYMQEPLFVEFADCCLGIVEPSQNEES 


3646 


A 


85 


1948 


ERGGGKAAAAAAAAAAARALAASGQDPRPHPR 

APPWDDSGDDDEATTPADKSELHHTLKNLSLKL 

DDLSTCNDLIAKHGAALQRSLTELDGLKIPSESG 

EKLKVVNERATLFRITSNAMINACRDFLELAEfflS 

RKWQRALQYEQEQRVHLEETffiQLAKQHNSLER 

AFHSAPGRPANPSKSFIEGSLLTPKGEDSEEDEDT 

EYFDAMEDSTSFITVITEAKEDSRKAEGSTGTSSA 

DWSSADNVLDGASLVPKGSSKVKRRVRIPNKPN 

YSLNLWSIMKNCIGRELSRIPMPVNFNEPLSMLQ 

RLTEDLEYHHLLDKAVHCTSSVEQMCLVAAFSV 

SSYSTTVHRIAKPFNPMLGETFELDRLDDMGLRS 

LCEQVSHHPPSAAHYVFSKHGWSLWQEITISSKF 

RGKYISIMPLGAIHLEFQASGNHYYWRKSTSTVH 

NIIVGKLWIDQSGDIEIVNHKTNDRCQLKFLPYSY 

FSKEAARKVTGVVSDSQGKAHYVLSGSWDEQM 

ECSKVMHSSPSSPSSDGKQKTVYQTLSAKLLWK 

KYPLPENAENMYYFSELALTLNEHEEGVAPTDS 

RLRPDQRLMEKGRWDEANTEKQRLEEKQRLSR 

RRRLEACGPGSSCSSEE 


3647 


A 


46 


5007 


PTGDACVSTSCELASALSHLDASHLTENLPKAAS 

ELGQQPMTELDSSSDLISSPGKKGAAHPDPSKTS 

VDTGQVSRPENPSQPASPRVTKCKARSPVRLPHE 

GSPSPGEKAAAPPDYSKTRSASETSTPHNTRRVA 

ALRGAGPGAEGMTPAGAVLPGDPLTSQEQRQGA 

PGNHSKALEMTGIHAPESSQEPSLLEGADSVSSR 

APQASLSMLPSTDNTKEACGHVSGHCCPGGSRE 

SPVTDIDSFIKELDASAARSPSSQTGDSGSQEGSA 

QGHPPAGAGGGSSCRAEPVPGGQTSSPRRAWAA 

GAPAYPQWASQPSVLDSINPDKHFTVNKNFLSN 

YSRNFSSFHEDSTSLSGLGDSTEPSLSSMYGDAE 

DSSSDPESLTEAPRASARDGWSPPRSRVSLHKED 

PSESEEEQIEICSTRGCPNPPSSPAHLPTQAAICPAS 

AKVLSLKYSTPRESVASPREKVACLPGSYTSGPD 

SSQPSSLLEMSSQEHETHADISTSQNHRPSCAEET 

TEVTSASSAMENSPLSKVARHFHSPPIILSSPNMV 

NGLEHDLLDDETLNQYETSINAAASLSSFSVDVP 

KNGESVLENLHISESQDLDDLLQKPKMIARRPIM 

AWFKEINKHNQGTHLRSKTEKEQPLMPARSPDS 

KIQMVSSSQKKGVTVPHSPPQPKTNLENKDLSKK 

SPAEMLLTNGQKAKCGPKLKRLSLKGKAKVNSE 

APAANAVKAGGTDHRKPLISPQTSHKTLSKAVS 

QRLHVADHEDPDKiSrTTAAPRSPQCVLESKPPLAT 

SGPLKPSVSDTSIRTFVSPLTSPKPVPEQGMWSRF 

HMAVLSEPDRGCPTTPKSPKCRAEGRAPRADSG 

PVSPAASRNGMSVAGNRQSEPRLASHVAADTAQ 

PRPTGEKGGNIMASDRLERTNQLKIVEISAEAVSE 

TVCGNKPAESDRRGGCLAQGNCQEKSEIRLYRQ 

VAESSTSHPSSLPSHASQAEQEMSRSFSMAKLAS 

SSSSLQTAIRKAEYSQGKSSLMSDSRGVPRNSIPG 

GPSGEDHLYFTPRPATRTYSMPAQFSSHFGREGH 

PPHSLGRSRDSQVPVTSSVVPEAKASRGGLPSLA 

NGQGIYSVKPLLDTSRNLPATDEGDIISVQETSCL 

VTDKDCVTRRHYCYEQNWPHESTSFFSVKQRIKS 

FENLANADRPVAKSGASPFLSVSSKPPIGRRSSGS 

IVSGSLGHPGDAAARLLRRSLSSCSENQSEAGTL 

LPQMAKSPSIMTLTISRQNPPETSSKGSDSELKKS 

LGPLGIPTPTMTLASPVKKNKSSVRHTQPSPVSRS 

KLQELRALSMPDLDKLCSEDYSAGPSAVLFKTEL 

EITPRRSPGPPAGGVSCPEKGGNRACPGGSGPKT 

SAAETPSSASDTGEAAQDLPFRRSWSVNLDQLLV 

SAGDQQRLQSVLSSVGSKSTILTLIQEAKAQSENE 

EDVCFIVLNRKEGSGLGFSVAGGTDVEPKSITVH 

RVFSQGAASQEGTMNRGDFLLSVNGASLAGLAH 

GNVLKVLHQAQLHKDALVVIKKGMDQPRPSAR 

QEPPTANGKGLLSRKTIPLEPGIGRSVAVHDALC 

VEVLKTSAGLGLSLDGGKSSVTGDGPLVIKRVY 

KGGAAEQAGIIEAGDEILAINGKPLVGLMHFDA 

WNIMKSVPEGPVQLLIRKHRNSS 


3648 


A 


337 


1564 


KSRLSVTLMPVQLSEHPEWNESMHSLRISVGGLP
VLASMTKAADPRFRPRWKVVLTFFVGAAILWLL
CSHRPAPGRPPTHNAHNWRLGQAPANWYNDTY
PLSPPQRTPAGIRYRIAVIADLDTESRAQEENTWF
TYLKKGYLTFSDSGDKVAVEWDKDHGVLESHL
AEKGRGMELSDLIVFNGKLYSVDDRTGVVYQIE
GSKAVPWVILSDGDGTVEKGFKAEWLAVKDER

LYVGGLGKEWTTTTGDVVNENPEWVKVVGYK 

GSVDHENWVSNYNALRAAAGIQPPGYLIHESAC 

WSDTLQRWFFLPRRASQERYSEKDDERKGANLL 

LSASPDFGDIAVSHVGAWPTHGFSSFKFDPNTDD 

QIIVALKSEEDSGRVASYIMAFTLDGRFLLPETKI 

GSVKYEGIEFI 


3649 


A 


1 


775 


PTRPGSGSAGGARVGSGEFGVEMAALAPLPPLPA 

QFKSIQHHLRTAQEHDKRDPVVAYYCRLYAMQ 

TGMKIDSKTPECRKFLSKLMDQLEALKKQLGDN 

EAITQEIVGCAHLENYALKMFLYADNEDRAGRF 

HKNMIKSFYTASLLIDVITVFGELTDENVKHRKY 

ARWKATYIHNCLKNGETPQAGPVGDSEDNDffiEN 

EDAGAASLPTQPTQPSSSSTYDPSNMPSGNYTGI 

QIPPGAHAPANTPAEVPHSTGVAK 


3650 


A 


20 


963 


KMAATLGPLGSWQQWRRCLSARDGSRRLLLLL 

LLGSGQGPQQVGAGQTFEYLKREHSLSKPYQGE 

APRPCFLRDWELQVHFKIHGQGKKNLHGDGLAI 

WYTKDRMQPGPVFGNMDKFVGLGVFVDTYPNE 

EKQQERVFPYISAMVNNGSLSYDHERDGRPTEL 

GGCTAIVRJ^HYDTFLVmYVKRHLmiMDro 

HEWRDCIEVPGVRLPRGYYFGTSiSITGDLSDNHD 

VISLKLFELTVERTPEEEKLHRDVFLPSVDNMKL 

PEMTAPLPPLSGLALFLIVFFSLVFSVFAIVIGIILY 
NKWQEQSRKRFY 


3651 


A 


1 


1218 


RSWAYVKKCKNNMCPNRGLHDGPEPCWLHHA 

AGTVSAVQARGLQPSQSRSRPRVPGLATALAYG 

PAHTPPLSRIGWAMQPPPPGPLGDCLRDWEDLQ 

QDFQNIQVSAAADAGSPPSRVSLAQGQGSGSPGC 

KPSLPAEAEGAAQELENQMKERQGLFFDMEAYL 

PKKNGLYLSLVLGNVNVTLLSKQAKFAYKDEYE 

KFKLYLTIDLILISFTCRFLLNSRVTDAAFNFLLVW 

YYCTLTIRESILINNGSRJKGWWVFHHYVSTFLSG 

VMLTWPDGLMYQKFRNQFLSFSMYQSFVQFLQ 

YYYQSGCLYRLRALGERHTMDLTVEGFQSWMW 

RVLTFLLPFLFFGHFWQLFNALTLFNLAQDPQCK 

EWQVLMCGFPFLLLFLGNFFTTLRVVHHKFHSQ 

RHGSKKD 


3652 


A 


640 


164 


VTTSCIIPFAFGLGVRASERLAEIDMPYLLKYQPM 

MQTIGQKYCMDPAVIAGVLSRKSPGDKILVNMG 

DRTSMVQDPGSQAPTSWISESQVFQTTEVLTTRI 

TELQRRFPTWTPDQYLRGGLCAYSGGAGYVRSS 

QDLSCDFCNDVLARAKYLKRHGF 


3653 


A 


2 


909 


IVRRDWQEVSDIHLAMANCKMTKSIRFPALEHC 

YTGGEVVLPKDQEEWKRRTGLLLYENYGQSETG 

LICATYWGMKIKPGFMGKATPPYDVQFHMEASV 

ENCIIVSMNTADPGSQGITHSLLLQVIDDKGSILPP 

NTEGNIGIRIKPVRPVSLFMCYEGDPEKTAKVEC 

GDFYNTGDRGKMDEEGYICFLGRSDDIINASGYR 

IGPAEVESALVEHPAVAESAVVGSPDPIRGEVVK 

AFIVLTPQFLSHDKDQLTKELQQHVKSVTAPYKY 

PRKVEFVSELPKTITGKIERKELRJKXETGQM 


3654 


A 


2 


909 


IVRRDWQEVSDIHLAMANCKMTKSIRFPALEHC

YTGGEVVLPKDQEEWKRRTGLLLYENYGQSETG 

LICATYWGMKIKPGFMGKATPPYDVQFHMEASV 

ENCIIVSMNTADPGSQGITHSLLLQVIDDKGSILPP 

NTEGNIGIRIKPVRPVSLFMCYEGDPEKTAKVEC

GDFYNTGDRGKMDEEGYICFLGRSDDIINASGYR 

IGPAEVESALVEHPAVAESAVVGSPDPIRGEVVK 

AFIVLTPQFLSHDKDQLTKELQQHVKSVTAPYKY 

PRKVEFVSELPKTITGKIERKELRKKETGQM 


3655 


A 


2 


2364 


SPGPSLPESAESLDGSQEDKPRGSCAEPTFTDTG 

MVAHINNSRLKAKGVGQHDNAQNFGNQSFEEL 

RAACLRKGELFEDPLFPAEPSSLGFKDLGPNSKN 

VQNISWQRPKDIINNPLFIMDGISPTDICQGILGDC 

WLLAAIGSLTTCPKLLYRWPRGQSFKKNYAGIF 

HFQIWQFGQWVNWVDDRLPTKNDKLVFVHST 

ERSEFWSALLEKAYAKLSGSYEALSGGSTMEGL 

EDFTGGVAQSFQLQRPPQNLLRLLRKAVERSSL 

MGCSIEVTSDSELESMTDKMLVRGHAYSVTGLQ 

DVHYRGKMETLIRVRNPWGRIEWNGAWSDSAR 

EWEEVASDIQMQLLHKTEDGEFWMSYQDFLNN

FTLLEICNLTPDTLSGDYKSYWHTTFYEGSWRTG 

SSAGGCRNHPGTFWTNPQFKISLPEGDDPEDDAE 

GNVVVCTCLVALMQKNWRHARQQGAQLQTIGF 

VLYAVPKEFQNIQDVHLKKEFFTKYQDHGFSEIF 

TNSREVSSQLRLPPGEYIIIPSTFEPHRDADFLLRV 

FTEBCHSESWELDEVNYAEQLQEEKVSEDDMDQ 

DFLHLFKIVAGEGKEIGVYELQRLLNRMAIKFKS 

FKTKGFGLDACRCMINLMDKDGSGKLGLLEFKI 

LWKKLKKWMDIFRECDQDHSGTLNSYEMRLVIE 

KAGIKLNNKVMQVLVARYADDDLIIDFDSFISCF 

LlU.KTMFrFFLTMDPKNTGHICLSLEQVLGEGW 

EGICRIAPACPSTPPPPSSDVPGPASCPRLFPPWDL 

LPVSTVAADDHVGIEAL 


3656 


A 


3 


174 


PLCTHYLLPELPEKSSRTSPRSRPGNMLSGDPHLP 
QPLCHCLDHCPCCFSGKRLVA 


3657 


A 


1 


444 


DTRSTYHNAHSLPTYVKSPAPCQMTYIKSPAPCQ 

TQTCYVQGASPCQSYYVQAPASGSTSQYCVTDP 

GSAPCSTSYCCLAPRTFGVSPLRRWIQRPQNCNT 

GSSGCCENSGSSGCCGSGGCGCSCGCGSSGCCCL 

GIIPMKSRSPALL 


3658 


A 


92 


1537 


SEAPVQPQPYTMTSFYSTSSCPLGCTMAPGARNV 

FVSPIDVGCQPVAEANAASMCLLANVAHANRVR 

VGSTPLGRPSLCLPPTSHTACPLPGTCHIPGNIGIC 

GAYGKimNGHEKETMKFLNDRLANYLEKVRQ 

LEQENAELETTLLERSKCHESTVCPDYQSYFRTIE 

ELQQKJLCSKAENARLIVQIDNAKLAADDFRIKL 

ESERSLHQLVEADKCGTQKLLDDATLAKADLEA 

QQESLKEEQLSLKSNHEQEVKILRSQLGEKFRIEL 

DIEPTIDLNRVLGEMRAQYEAMVETNHQDVEQ 

WFQAQSEGISLQAMSCSEELQCCQSEILELRCTV 

NALEVERQAQHTLKDCLQNSLCEAEDRYGTELA 

QMQSLISNLEEQLSEIRADLERQNQEYQVLLDVK 

ARLENEIATYRNLTPLQSLFHACLLYFLSKLWPC 

HRWVSLWPWSQHGEMILKARVRRLRLVALGSG 

VPSPCPVFLQD 


3659 


A 


2 


402 


DLLQCLNQLYSASTEMSCQQSQQQCQPPPKCTP 
KCPPKCTPKCPPKCPPKCPPQYSAPCPPPVSSCCG 
SSSGGCCSSEGGGCCLSHHRPRQSLRRRPQSSSC 
CGSGSGQQSGGSSCCHSSGGSGCCHSSGGCC 


3660 


A 


26 


710 


CSAVEVKMAARTAFGAVCRRLWQGLGNFSVNT 

SKGNTAKNGGLLLSTNMKWVQFSNLHVDVPKD 

LTKPVVTISDEPDILYKRLSVLVKGHDKAVLDSY 

EYFAVLAAKELGISIKVHEPPRKIERFTLLQSVHI 

YKKHRVQYEMRTLYRCLELEHLTGSTADVYLEY 

IQRNLPEGVAMEVTKFCFFIFLDTIRTVTRTHQGA 

NLGNTIRRKRRKQVIKPQGGHFCLNLK 


3661 


A 


2 


370 


DVSVAASEPTVYRNPTKMSCQQNQQQCQPPPKC 
PIPKYPPKCPSKCASSCPPPISSCCGSSSGGCCSSG 
GCGCCSSEGGGCCLSHHRHHRSHCHRPKSSNCY 
GSGSGQQSGGSGCCSGGGCC 


3662 


A 


205 


1277 


RKSLPHPNPQKMLKKPLSAVTWLCIFIVAFVSHP 

AWLQKLSKHKTPAQPQLKAANCCEEVKELKAQ 

VANLSSLLSELNKKQERDWVSVVMQVMELESN 

SKRMESRLTDAESKYSEMNNQIDIMQLQAAQTV 

TQTSAGKETSPLRERGVPPHLQHCFYIPPDDFLGS 

PELEVFCDMETSGGGWTIIQRRKSGLVSFYRDW 

KQYKQGFGSIRGDFWLGNEHIHRLSRQPTRLRVE 

MEDWEGNLRYAEYSHFVLGNELNSYRLFLGNY 

TGNVGNDALQYHNNTAFSTKDKDNDNCLDKCA 

QLRKGGYWYNCCTDSNLNGVYYRLGEHNKHLD 

GITWYGWHGSTYSLKRVEMKIRPEDFKP 


3663 


A 


64 


1456 


LSSAKETLAQMYNTVWNMEDLDLEYAKTDINC 
GTDLNIFYFFIMDPPALPPKPPKPTTVANNGMNNN 
MSLQDAEWYWGDISREEVNEKLRDTADGTFLV 
RDASTKMHGDYTLTLRKGG>JNKLIKIFHRDGKY 

GFSDPLTFSSVVELI^^ml^ffiSLAQYl^KLDVKL 
LYPVSKYQQDQVVKEDNIEAVGKKLHEYNTQFQ 
EKSREYDRLYEEYTRTSQEIQMKRTAIEAFNETIK 
IFEEQCQTQERYSKEYIEKFKREGNEKEIQRIMHN 
YDKLKSRISEIIDSRRRLEEDLKKQAAEYREIDKR 
MNSIKPDLIQLRKTRDQYLMWLTQKGVRQKKL 
NEWLGNENTEDQYSLVEDDEDLPHHDEKTWNV 
GSSNRNKAENLLRGKRDGTFLVRESSKQGCYAC 
SVVVDGEVKHCVINKTATGYGFAEPYNLYSSLK 
ELVLHYQHTSLVQHNfDSLNVTLAYPVYAQQRR 


3664 


A 


944 


406 


GATVEDQSCNFGSLRWWSVPHISARSCPDPLLS 

RTGRVPGGRGAGLPRHHSPRCCLQVFFNGANVR 

QVDVPTLTGAFGILAAHVPTLQVLRPGLWVHA 

EDGTTSKYFVSSGSIAVNADSSVQLLAEEAVTLD 

MLDLGAAKANLEKAQAELVGTADEATRAEIQIR 

IEANEALVKALE 


3665 


A 


98 


1388 


ASQLAFGGKLTSTPSRDFQGCGRGAVTCCSFHEH 

RHQSGRCLSTGMAPNLKGRPRKKKPCPQRRDSF 

SGVKDSNNNSDGKAVAKVKCEARSALTKPKNN 

HNCKKVSNEEKPKVAIGEECRADEQAFLVALYK 

YMKERKTPIERIPYLGFKQINLWTMFQAAQKLG 

GYETITARRQWKHIYDELGGNPGSTSAATCTRR 

HYERLILPYERFIKGEEDKPLPPIKPRKQENSSQE 

NENKTKVSGTKRIKHEIPKSKKEKENAPKPQDAA 

EVSSEQEKEQETLISQKSIPEPLPAADMKKKIEGY 

QEFSAKPLASRVDPEKDNETDQGSNSEKVAEEA 

GEKGPTPPLPSAPLAPEKDSALVPGASKQPLTSPS 

ALVDSKQESKLCCFTESPESEPQEASFPRLPHHTG 

HRWQTRMRRRMTNCPPWQITLPTAP 


3666 


A 


113 


1492 


LLQEMCTKTIPVLWGCFLLWNLYVSSSQTIYPGI 

KAmTQRALDYGVQAGMKMffiQMLKEKKLPDL 

SGSESLEFLKVDYVNYNFSNIKISAFSFPNTSLAF 

VPGVGIKALTNHGTANISTDWGFESPLFVLYNSF 

AEPMEKPILKNLNEMLCPIIASEVKALNANLSTLE 

VLTKIDNYTLLDYSLISSPEITENYLDLNLKGVFY 

PLENLTDPPFSPVPFVLPERSNSMLYIGIAEYFFKS 

ASFAHFTAGVFNVTLSTEEISNHFVQNSQGLGNV 

LSRIAEIYILSQPFMVRIMATEPPIINLQPGNFTLDI 

PASIMMLTQPKNSTVETIVSMDFVASTSVGLVIL 

GQRLVCSLSLNRFRLALPESNRSNIEVLRFENILSS 

ILHFGVLPLANAKLQQGFPLPNPHKFLFVNSDIEV 

LEGFLLISTDLKYETSSKQQPSFHVWEGLNLISRQ 

WRGKSAP 


3667 


A 


1 


181 


FRGRLGSGRNGGGSMNAPPAFESFLLFEGEKITIN 
KDTKVPNACLFTINKEDHTLGNnK 


3668 


A 


212 


431 


VAGEAVPFFPMMYSEPLKPSYLALVLWYFLLTG 
YCITKPEVIFKIEQGEEPWILEKGFPSQCHPAKYL 
WCLHD 


3669 


A 


458 


1056 


FSGVCFAGIAGSMATLLHDAVMNPAEVVKQRLQ 
MYNSQHRSAISCIRTVWRTEGLGAFYRSYTTQLT 
MNIPFQSIHFITYEFLQEQVNPHRTYNPQSHIISGG 
LAGALAAAATTPLDVCKTLLNTQENVALSLANIS 
GRLSGMANAFRTVYQLNGLAGYFKGIQARVIYQ 
MPSTAISWSVYEFFKYFLTKRQLENRAPY 


3670 


A 


145 


298 


RNPCPLTFLPSTLMVLLLSLTT'FSALTFHSICQLRN 
TGVEVDIVFQRVSFL 


3671 


A 


3 


462 


ILKVAKKERTMSSLPVPYKLPVSLSVGSCVIIKGT 

PIHSFINDPQLQVDFYTDMDEDSDIAFRFRVHFG 

NHVVMNRREFGIWMLEETTDYVPFEDGKQFELC 

IYVHYNEYEIKVNGHTHLRALSHRIPPSFVEDGC 

KCPRRYLPWTSVCVCN 


3672 


A 


1 


1028 


HYAKLGTRPRLKFMSSPSLSDLGKREPAAAADE 

RGTQQRRACANATWNSIHNGVIAVFQRKGLPDQ 

ELFSLNEGVRQLLKTELGSFFTEYLQNQLLTKGM 

VILRDKIRFYEGQKLLDSLAETWDFFFSDVLPML 

QAIFYPVQGKEPSVRQLALLHFRNAITLSVKLED 

ALARAHARVPPAIVQMLLVLQGVHESRGVTEDY 

LRLETLVQKVVSPYLGTYGLHSSEGPFTHSCILEK 

RLLRRSRSGDVLAKNPVVRSKSYNTPLLNPVQE 

HEAEGAAAGGTSIRRHSVSEMTSCPEPQGFSDPP 

GQGPTGTFRSSPAPHSGPCPSRLYPTTQPPEQGLD 

PTRS 


3673 


A 


2 


712 


RPPRVWYPELRELSAAAPRWSHRTAPGIMVFYF 

TSSSVNSSAYTIYMGKDKYENEDLIKHGWPEDI 

WFHVDKLSSAHVYLRLHKGENIEDIPKEVLMDC 

AHLVKANSIQGCKMNNVNVVYTPWSNLKKTAD 

MDVGQIGFHRQKDVKIVTVEKKVNEILNRLEKT 

KVERFPDLAAEKECRDREERNEKKAQIQEMKKR 

EKEEMKKKREMDELRSYSSLMKVENMSSNQDG 

NDSDEFM 


3674 


A 


2 


712 


RPPRVWYPELRELSAAAPRWSHRTAPGIMVFYF 

TSSSVNSSAYTIYMGKDKYENEDLIKHGWPEDI 

WFHVDKLSSAHVYLRLHKGENIEDIPKEVLMDC 

AHLVKANSIQGCKIVINNVNVWTPWSNLKKTAD 

MDVGQIGFHRQKDVKIVTVEKKVNEILNRLEKT 

KVERFPDLAAEKECRDREERNEKKAQIQEMKKR 

EKEEMKKKREMDELRSYSSLMKVENMSSNQDG 

NDSDEFM 


3675 


A 


921 


1321 


VTLAKMRVfflSSCLKVQEQMANCPKFVPVVPTS 
QPIPSNIPNRSTFACPYCGARNLDQQELVKHCVE 
SHRSDPNRVVCPICSAMPWGDPSYKSANFLQHL 
LHRHKFSYDTFVDYSIDEEAAFQAALALSLSEN 


3676 


A 


3 


1856 


TLGRWLLGVYETVAPTLACLPRPRLRRRRRRRR 

RRIVQSRYTRKAWQSLELKGITKHALNHHPPPEK 

LEEISPTSDSHEKDTSSQSKSDITRESSFTSADTGN 

SLSAFPSYTGAGISTEGSSDFSWGYGELDQNATE 

KVQTMFTAIDELLYEQKLSVHTKSLQEECQQWT 

ASFPHLRILGRQETPSEGYRLYPRSPSAVSASYET 

TLSQERDSTIFGIRGKKLHFSSSYAHKASSIAKSSS 

FCSMERDEEDSIIVSEGIIEEYLAFDHIDIEEGFHG 

KKSEAATEKQKLGYPPIAPFYCMKEDVLAYVFD 

SVWCKVVSCMEQLTRSHWEGFASDDESNVAVT 

RPDSESSCVLSELHPLVLPRVPQSKVLYITSNPMS 

LCQASRHQPNVNDLLVHGMPLQPRNLSLMDKLL 

DLDDKLLMRPGSSTILSTRNWPNRAVEFSTSSLS 

YTVQSTRRRNPPPRTLHPISTSHSCAETPRSVEEIL 

RGARVPVAPDSLSSPSPTPLSRNNLLPPIGTAEVE 

HVSTVGPQRQMKPHGDSSRAQSAVVDEPNYQQ 

PQERLLLPDFFPRPNTTQSFLLDTQYRRSCAVEYP 

HQARPGRGSAGPQLHGSTKSQSGGRPVSRTRQG 

P 


3677 


A 


246 


757 


MRLQGAIFVLLPHLGPILVWLFTRDHMSGWCEG 

PRMLSWCPFYKVLLLVQTAIYSVVGYASYLVWK 

DLGGGLGWPLALPLGLYAVQLTISWTVLVLFFT 

\nHGWGLALLHLLLLYGLVVSTALIWHPINKLAAL 

LLLPYLAWLTVTSALTYHLWRDSLCPVHQPQPT 

EKSD 


3678 


A 


20 


1508 


RGKAEFFLAMAGTNALLMLENFIDGKFLPCSSYI 

DSYDPSTGEVYCRVPNSGKDEIEAAVKAAREAFP 

SWSSRSPQERSRVLNQVADLLEQSLEEFAQAESK 

DQGKTLALARTMDIPRSVQNFRFFASSSLHHTSE 

CTQMDHLGCMHYTVRAPVGVAGLISPWNLPLY 

LLTWKIAPAMAAGNTVIAKPSELTSVTAWMLCK 

LLDKAGVPPGWNIVFGTGPRVGEALVSHPEVPL 

ISFTGSQPTAERITQLSAPHCKKLSLELGGKNPAn 

FEDANLDECIPATVRSSFANQGEICLCTSRIFVQK 

SIYSEFLKRPVEATRKWKVGIPSDPLVSIGALISK 

AHLEKVRSYVKRALAEGAQIWCGEGVDKLSLPA 

RNQAGYFMLPTVITDIKDESCCMTEEIFGPVTCV 

VPFDSEEEVIERANNVKYGLAATVWSSNVGRVH 

RVAKKLQSGLVWTNCWLIRELNLPFGGMKSSGI 

GREGAKDSYDFFTEIKTITVKH 


3679 


A 


1862 


502 


MAGTKPYMEIQTTIREYYEHLYANKLENLEEMD 

KFLDTYTLPRLNQEEVESLNRPITGSEIEAIINSLP 

TKKIPGPDRFTAKFYQRYKEELSNLIHYLGLSHH 

LLALNFIIVSFGKKSAWSSAQVKVTDTDFDGVEV 

RVFEGPPKPEEPLKRSVVYIHGGGWALASAKIRY 

YDELCTAMAEELNAVIVSIEYRLVPKVYFPEQIH 

DVVRATKYFLKPEVLQKYMVDPGRICISGDSAG 

GNLAAALGQQFTQDASLKNKLKLQALIYPVLQA 

LDFNTPSYQQNVNTPILPRYVMVKYWVDYFKG 

NYDFVQAMIVNNHTSLDVEEAAAVRARLNWTS 

LLPASFTKNYKPVVQTTGNARIVQELPQLLDARS 

APLIADQAVLQLLPKTYILTCEHDVLRDDGIMYA 

KRLESAGVEVTLDHFEDGFHGCMIFTSWPTNFSV 

GIRTRNSYIKWLDQNL 


3680 


A 


249 


2146 


RSWGAPWFWRMRLLRRRHMPLRLAMVGCAFV 

LFLFLLHRDVSSREEATEKPWLKSLVSRKDHVLD 

LMLEAMNNLRDSMPKLQIRAPEAQQTLFSINQSC 

LPGFYTPAELKPFWERPPQDPNAPGADGKAFQK 

SKWTPLETQEKEEGYKKHCFNAFASDRISLQRSL 

GPDTRPPECVDQKFRRCPPLATTSVIIVFHNEAWS 

TLLRTVYSVLHTTPAILLKEnLVDDASTEEHLKE 

KLEQYVKQLQVVRWRQEERKGLITARLLGASV 

AQAEVLTFLDAHCECFHGWLEPLLARIAEDKTV 

VVSPDIVTIDLNTFEFAKPVQRGRVHSRGNFDWS 

LTFGWETLPPHEKQRRKDETYPIKSPTFAGGLFSI 

SKSYFEHIGTYDNQMEIWGGENVEMSFRVWQC 

GGQLEnPCSVVGHVFRTKSPHTFPKGTSVIARNQ 

VRLAEVWMDSYKKIFYRRNLQAAKMAQEKSFG 

DISERLQLREQLHCHNFSWYLHNVYPEMFVPDL 

TPTFYGAIKNLGTNQCLDVGENNRGGKPLIMYS 
CHGLGGNQYFEYTTQRDLRHNIAKQLCLHVSKG 
ALGLGSCHFTGKNSQVPKDEEWELAQDQLIRNS 
GSGTCLTSQDKKPAMAPCNPSDPHQLWLFV 


3681 


A 


2982 


1869 


LKDTLKSQMTQEASDEAEDMKEAMNRMIDELN 

KQVSELSQLYKEAQAELEDYRKRKSLEDVTAEY 

IHKAEHEKLMQLTNVSRAKAEDALSEMKSQYSK 

VLNELTQLKQLVDAQKENSVSITEHLQVITTLRT 

AAKEMEEKISNLKEHLASKEVEVAKLEKQLLEE 

KAAMTDAMVPRSSYEKLQSSLESEVSVLASKLK 

ESVKEKEKVHSEWQIRSEVSQVKREKENIQTLL 

KSKEQEVNELLQKFQQAQEELAEMKRYSESSSK 

LEEDKDKKINEMSKEVTKLKEALNSLSQLSYSTS 

SSKRQSQQLEALQQQVKQLQNQLAECKKQHQE 

VISVYRMHLLYAVQGQMDEDVQKVLKQILTMC 

KNQSQKK 


3682 


A 


447 


1024 


AQALTAGRQLALAAPFIAPISPISLPRLNPPSQSW 

NSTPFFKVKLPPQKEVITSDELMAHLGNCLLSIKP 

QEKSEGLQLNFQQNVDDAMTVLPKLATGLDVN 

VRFTGVSDFEYTPECSVFDLLGIPLYHGWLVDPQ 

QSPEAVRAVGKLSYNQL/VGEDHHLQTLQ*HQP 

RDRKPDCRAVPGDHRGPSDLPRTV 


3683 


A 


2 


942 


LEIKQEEKFVGQCIKEELMHGECVKEEKDFLKKE 

IVDDTKVKEEPPINHPVGCKRKLAMSRCETCGTE 

EAKYRCPRCMRYSCSLPCVKKHKAELTCNGVRD 

KTAYISIQQFTEMNLLSDYRFLEDVARTADHISR 

DAFLKRPISNKYMYFMKNRARRQGINLKLLPNG 

FTKRKENSTFFDKKKQQFCWHVKLQFPQSQA\ST 

*KKRVPDDKTINEILKPYIDPEKSDPVIRQRLKAYI 

RSQTGVQILMKIEYMQQNLVRYYELDPYKSLLD 

NLRNKVIIEYPTLHVVLKGSNNDMKVLHQVKSE 

STKNVGNEN 


3684 


A 


119 


1533 


SLQENVQEKRVRVCPGLGGLLPNGTPSITAAAAP 

QVLWRHVQPGCSHHLHACVIRAACRAGEGHAD 

RHAGPPET/PVTLPSSWPWSSPWERQCPMH\L*AP 

GHAFRPVPTEHRRGWAALGHHRAAAGPLREPAS 

GSQPAPASC*PECHHGCPEQTRQCQDLLREAW 

APEQRG*PCAHLQT*ATATTLCPQVPAGRVWQP 

GHSCHLLPHRHDGSH*HHCAAHRRPVTRRQAAH 

GVPLPDACYSPHHTLPAAPPPATRPAGHTATHPE 

*GGDLTPVPDGPHDCPRDVQGIPGAGGGSQLAPC 

CPPFPAAPVSVQGTQGLGPKNVLH*QWEGIRWQ 

KEPE/PGPPPEVELKRGAKCRIGDHGLGAVLGQG 

EYAS*SPSIPW»ASSSACPPLHPTT/TVYTQSPAAA 

PGWTRPPSP/PPPGLYPGP/PASHAPGVRGGISHQL 

YSLP*LCRECCSCP/PPPPAHGGRCPSLLPPEALAK 

LLL 


3685 


A 


101 


438 


AWVLQCKINTELQTEVVMLKSMVLWLGEQVQS 

LQLQQQLHCHFNHTHICVTNLEYNMCEYPWDLV 
KAHLQGASTSNITFDIGELQKKMLDLNKQTQEFQ 
PSL*AWTEFQQGLE 


3686 


A 


105 


845 


VSDVVKNQLVEVQCRQDGCDAVENVHQMFMF 
NWFTDCLWTLFLSNYQPSVESSSPGGSATSDDHE 
FDPSADMLVHDFDDERTLEEEEMMEGETNFSSEI 
EDLAREGDMPIHELLSLYGYGSTVRLPEEDEEEE 
EEEEEGEDDEDADNDDNSGCSGENKEENIKDSS 
GQEDETQSSNDDPSQSVASQDAQEimPRRCKYF 
DTNSEVEEESEEDEDYEP/SnSFFQSSDGPSSSSSE 
DWKKEIMVGS 


3687 


A 


49 


1225 


PVLVTSLRMREADTLRPPQLMEVSADHSTVEFN 

HTGELLATGDKGGRWIFQREPESF^IAPHSQGE 

YDVYSTFQSHEPEFDYLKSLEIEEKINKIKWLPQQ 

NAAHSLLSTNDKTIKLWKITERDKRPEGYNLKDE 

EGKLKDLSTVTSLQVPVLKPMDLMVEVSPRRIFA 

NGHTYHINSISVNSDCETYMSADDLRINLWHLAI 

TDRSFTPVNTVDIKPANMEDLTEVITASEFHPHHC 

NLFVYSSSKGSLRLCDMRAAALCDKHSKLFEEPE 

DPSNRSFFSEnS\SVSDVKFSHSDRYMLTR\DyLT 

VKVWDLNMEARPBETYQVHDYLRSKLCSLYEND 

CIFDKFECAWNGSDR/IIMTGAYNNFFRMFDRNT 

KRDVTLEASRGSSKPRAVL 


3688 


A 


1 


401 


KKVPGRLSEMSFSLNFTLPANTTSSPVRDCGPSL 
GLAAGIPLLVATALLVALLFTLIHRRRSSIEAMEE 
SDRPCEISEroDNPKISENPRRSPTHEKNTMGAQE 
AHIYVKTVAGSEEPVHDRYRPTIEMERRR 


3689 


A 


698 


889 


GRVLVHCAMGVSRSATLVLAFLMIYENMTLVEA 
IPDGAGPPQISALTQAFVRQLQVLDNRLGRE 


3690 


A 


61 


153 


MGAHLVRRYLGDASVEPDPLQMPTFPPDYGF 


3691 


A 


61 


153 


MGAHLVRRYLGDASVEPDPLQMPTFPPDYGF 


3692 


A 


3 


2831 


PLVRRLLRQTLRRVGGARAVREAVMRAVLTWR 

DKAEHCINDIAFKPDGTQLILAAGSRLLVYDTSD 

GTLLQPLKGHKDTVYCVAYAKDGKRFASGSAD 

KSVnWTSKLEGILKYTHNDAIQCVSYNPITHQLA 

SCSSSDFGLWSPEQKSVSKHKSSSKIICCSWTNDG 

QYLALGMFNGIISIRNKNGEEKVKIERPGGSLSPI 

WSICWNPSSRWESFWMNRENEDAEDVIVNRYIQ 

EIPSTLKSAVYSSQGSEAEEEEPEEEDDSPRDDNL 

EERNDELAVADWG\QKVSFYQLSGKQIGKDRAL 

NFDPCCISYFTTCGEYILLGGSDKQVSLFTKDGVR 

LGTVGEQNSWVWTGQAKPDSNYWGGCQDGTI 

SFYQLIFSTVHGLYKDRYAYRDSMTDVIVQHLIT 

EQKVRIKCKELVKKIAIYRNRLAIQLPEKILIYELY 

SEDLSDMHYRVKEKIIKKFECNLLVVCANHIILC 

QEKRLQCLSFSGVKEREWQMESLIRYIKVIGGPP 

GREGLLVGLKNGQE,KFVDNLFAIVLLKQATAV 

RCLDMSASRKKLAVVDENDTCLVYDIDTKELLF 

QEPNANSVAWNTQCEDMLCFSGGGYLNIKASTF 

PVHRQKLQGFWGYNGSKBFCLHVFSISAVEVPQ 

SAPMYQYLDRKLFKEAYQIACLGVTDTDWRELA 

MEALEGLDFETAKKERKKRGETNNDLFLADVFS 

YQGKFHEAAKLYKRSGHENLALEMYTDLCMFE 

YAKDFLGSGDPKETKMLITKQADWARNIKEPKA 

AVEMYISAGEHVKAIEICGDHGWVDMLIDIARK 

LDKAEREPLLLCATYLKKLDSPGYAAETYLKMG 

DLKSLVQLHVETQRWDEAFALGEKHPEFKDDIY 

MPYAQWLAENDRFEEAQKAFHKAGRQREAVQV 

LEQLTNNAVAESRFNDAAYYYWMLSMQCLDIA 

QDPAQKD 


3693 


A 


3 


1099 


SSFPTCMRTVFHSNTSVSSLLHRPGHVTPQLTIHG 
GWRHHRDHTAIDEWDFNPSKFLIYTCLLLFSVLL 



WO 01/57190 PCT/US01/04098 




PLRLDGIIQWSYWAVFAPIWLWKLLWAGASVG 

AGVWARNPRYRTEGEACVEFKAMLIAVGIHLLL 

LMFEVLVCDRVERGTHFWLLVFMPLFFVSPVSV 

AACVWGFRHDRSLELEILCSVNELQFIFIALKLDRI 

IHWPWLWFVPLWILMSFLCLWLYY1VWSLLFL 

RSLDVVAEQRRTOVTMAISWITIVVPLLTFEVLL 

VHRLDGHNTFSYVSIFWLWLSLLTLMATTFRRK 

GGNHWWFAIRRDF/CQDQLPQPTGKPPPPPLTDH 

HGEKALPLQNKDRGSWPASRGSPRLL 


3694 


A 


483 


761 


PRSLIDYKSYMDTKLLVARFLEQSSCTMTPDIHE 
LVENIKSVLKSDEEHMEEAITSASFLEQIMAHSX 
QHIRAHKLPXETAGLXTSELRXLTP 


3695 


A 


483 


761 


PRSLIDYKSYMDTKLLVARFLEQSSCTMTPDIHE 
LVENIKSVLKSDEEHMEEAITSASFLEQIMAHSX 
QHIRAHKLPXETAGLXTSELRXLTP 


3696 


A 


456 


733 


LSAALWEEPILSLWSETKELTNRGKMNYPQIGPH 
RPHVKGLRVRPGPGTLSNAPKSLCPGMSNSDRGI 
H\GGEGQGPGKRAGHLGRGGGMSFL 


3697 


A 


877 


1873 


VWL*TLS*HTCALMTVCRSCLVKYLEENNTCPT 
CRIVIHQSHPLQYIGHDRTMQDIVYKLVPGLQEA 
EMRKQREFYHKLGMEVPGDIKGETCSAKQHLDS 
HRNGETKADDSSNKEAAE 


3698 


A 


1 


572 


KQCGIPHEVVRDENSSVYAEVSRLLLATGHWKR 

LRRDNPRFNLMLGERNRLPFGRLGHEPGLVQLV 

NYYRGADKLCRKASLVKLIKTSPELAESCTWFPE 

SYVIYPTNLKTPVAPAQNGIQPPISNSRTDEREFFL 

ASYNRKKEDGEGNVWIAKSSAGAKVWVQW*M 

TDLEEEIDIPSPVGLGLESEWPL 


3699 


A 


2008 


2432 


LHCKMGALETQTHPCSQNMLRSLQKCCCKVEE 

HHLQPVQVLQTLLHSATAGTGCRRPARPPPAPPT 

PTPWRSRQSGKQSERAS*LKGRGRYGLGALGGR 

GGRALGGSRWPPPLPGETLFSGCKHRRRRRGSD 

AAPGEEAGT 


3700 


A 


33 


1318 


GYQIGMALASGPARRALAGSGQLGLGGFGAPRR 

GAYEWGVRSTRKSEPPPLDRVYEIPGLEPITFAG 

KMHFVPWLARPIFPPWDRGYKDPRFYRSPPLHE 

HPLYKDQACYIFHHRCRLLEGVKQALWLTKTKL 

IEGLPEKVLSLVDDPRNHIENQDECVLNVISHARL 

WQTTEEIPKRETYCPVIVDNLIQLCKSQILKHPSL 

ARRICVQNSTFSATWNRESLLLQVRGSGGARLST 

KDPLPTIASREEIEATKNHVLETFYPISPIIDLHECN 

IYDVKNDTGFQEGYPYPYPHTLYLLDKANLRPH 

RLQPDQLRAKMILFAFGSALAQARLLYGNDAKV 

LEQPVWQSVGTDGRVFHFLVFQLNTTDLDSNE 

GVKNLAWVDSDQLLYQHFWCLPVIKKRVVVEP 

VGPVGFKPETFRKFLALYLHGAA 


3701 


A 


86 


465 


WTLCGPEAGMVGYDPKPDGRNNTKFQVAVAGS 
VSGLVTRALISPFDVIKIRFQLQHERLSRSDPSAK 
YHGILQASRQILQEEGPTAFWKGHVPAQILSIGY 
GAVQFLSFEMLTELVHRGSVYDARE 


3702 


A 


166 


814 


GFWEKTNQSSHSMDPLGAPSQFVDVDTLPSWGD 
SCQDELNSSDTTAEEFQEDTVRSPFLYNKDVNGK 
VVLWKGDVALLNCTAIVNTSNESLTDKNPVSESI 
FMLAGPDLKEDLQKLKGCRTGEAQLTKGFNLAA 
RFIIHTVGPKYKSRYRTAAESSLYSCYRNVLQLA 
KEQSMSSVGFCVINSAKRGYPLKDATHIALRTVR 
RFLEIHGETIEKVV 


3703 


A 


128 


1255 


SLGPSPKSATIPCCGDTMAPEEDAGGEALGGSFW 

EAGNYRRTVQRVEDGHRLCGDLVSCFQERARIE 

KAYAQQLADWARKWRGTVEKGPQYGILEKAW 

HAFFTAAERLSALHLEVREKLQGQDSERVRAWQ 

RGAFHRPVLGGFRESRAAEDGFRJKAQKPWLKRL 

KEVEASKKSYHAARKDEKTAQTRESHAKADSA 

VSQEQLRKLQERVERCAKEAEKTKAQYEQTLAE 

LHRYTPRYMEDMEQAFETCQAAERQRLLFFKD 

MLLTLHQHLDLSSSEKFHELHRDLHQGBEAASDE 

EDLRWWRSTHGPGMAMNWPQFEEWSLDTQRTI 

SRKEKGGRSPDEVTLTSIVPTRDGTAPPPQSPGSP 

GTGQDEEWSDEESP 


3704 


A 


1 


271 


ARGEDLALATGGGPDTVTHSNMPCPNSLVYDC 

WLNIKECSVGEHTFEDLGLCPGRNQREKKRSYK 

DFLREEEKIAAQVRNSSKKKLKDSE 


3705 


A 


170 


1318 


LNWANLVIMWPREEEKEKVQDYSLGGLSPDLRI 

DVSRKKKILKAYDEDEDEDLYPDIHPPPSLPLPG 

QFTCPQCRKSFTRRSFRPNLQLANMVQIIRQMCP 

TPYRGNRSNDQGMCFKHQEALKLFCEVDKEAIC 

VVCRESRSHKQHSVLPLEEVVQEYKAKLQGHVE 

PLRKHLEAVQKMKAKEERRVTELKSQMKSELA 

AVASEFGRLTRFLAEEQAGLERRLREMHEAQLG 

RAGAAASRLAEQAAQLSRLLAEAQERSQQGGLR 

LLQDIKETFNRCEEVQLQPPEVWSPDPCQPHSHD 

FLTDAIVRKMSRMFCQAARVDLTLDPDTAHPAL 

MLSPDRRGVRLAERRQEVADHPKRFSADCCVLG 

AQGFRSGRHYWEVCMGP 


3706 


A 


204 


1996 


SRERQTTWMDHNFAPAPPEMQSHGAPGPGTSFS 

HSHVLGRPIRPSRLPGGGSPLTPVLRKTIHLDTFP 

QSHIPQTSSRLGLGARTRSVPPQETGIALGASLSP 

LPTSSLVPRKLSSISLTLHQNSQARSLDRPLSHWE 

ELPTPGKKAAPHEGGRVSSPGSPPVTLVPGGRVH 

SEGPGNPGLTKSNRMLATEKPLVSSYLALPFQSR 

LAQSAPVLAEPGSLGQGHLVSVTDHMPTRASPG 

KGKPRARGPRPRGRLQRANTTVNLTAMDTRTD 

AARHLATMATNRPSLAINLATPNTSQLDTGTEFP 

ALDIKLGTARDLSSVGTVKSGKTVNLATAGTIKP 

GTAMNLTTVGTTKPGMVMDLIASEPDKLGKAM 

ATRSTAKPDMTTEGIAMDSATSDPVKPDTITATV 

GTSRLETAMALARVNRAKLGTAKNSLALDTSR 

MGTAVGSVVPVTPDPATGKTTLGSVNNLTISDV 

ATCLLMPSRSTDLALDNTNAAMDRATEPASLDL 

ATEYKGKCRNLVGDGLGCRBGEVCELGDGSMK 

PMSmSNLLGYIGIDTIffiQMRKKTMKTGFDFNIM 

WGTEGCGAAAGLVAGSTKDPISFPQ 


3707 


A 


3 


549 


SSSISRDFLGQAACASGTMLRWLRDFVLPTAACQ 

DAEQPMRYETLFQALDRNGDGVVDIGELQEGLR 

NLGIPLGQDAEEKIFTTGDVNKDGKLDFEEFMKY 

LKDHEKKMKLAFKSLDKNNDGKIEASEIVQSLQ 

TLGLTISEQQAELILQSIDVDGTMTVDWNEWRD 

YFLFNPVTDIEEIIR 


3708 


A 


1 


1866 


EFRGAGRANMLAPRGAAVLLLHLVLQRWLAAG 
AQATPQVFDLLPSSSQRLNPGALLPVLTDPALND 

LYVISTFKLQTKSSATIFGLYSSTDNSKYFEFTVM 

GRLSKAILRYLKNDGKVHLVVFNNLQLADGRRH 

RILLRLSNLQRGAGSLELYLDCIQVDSVHNLPRA 

FAGPSQKPEHELRTFQRKPQDFLEELKLVVRGSL 

FQVASLQDCFLQQSEPLAATGTGDFNRQFLGQM 

TQLNQLLGEVKDLLRQEVNETSFLRNTITECQAC 

GPLKFQSPTPSTVVPPASPAPPTRPPRRCPSNPCF 

RGVQCIDSRDGFQCGPCPEGYTGNGITCIDVDEC 

KYHPCYPGEHCINLSPGFRCDACPVGFTGPMVQ 

GVGISFAKSNKQVCTDIDECRNGACVPNSICVNT 

LGSYRCGPCKPGYTGDQIRGCKAERNCRNPELN 

PCSVNAQCIEERQGDVTCVCGVGWAGDGYICGK 

DVDIDSYPDEELPCSARNCKKDNCKYVPNSGQE 

DADRDGIGDACDEDADGDGILNEQDNCVLIHNV 

DQRNSDKDIFGDACDNCLSVLNNDQKDTDGDG 

RGDACDDDMDGDGIKNILDNCPKFPNRDQRDK 

DGDGVGDACDSCPDVSNPNQ 


3709 


A 


144 


417 


TQAMEGLLHYINPAHAISLLSALNEERLKGQLCD 
VLLIVGDQKFRAHKNVLAASSEYFQSLFTNKENE 
SQTVFQLDFCEPDAFDNVLNYIY 


3710 


A 


245 


688 


FGMLKNKGHSSKKDNLAVNAVALQDHILHDLQ 

LRNLSVADHSKTQVQKKENKSLKRDTKAIIDTGL 

KKTTQCPKLEDSEKEYVLDPKPPPLTLAQKLGLI 

GPPPPPLSSDEWEKVKQRSLLQGDSVQPCPICKE 

EFELRPQVFSIRG 


3711 


A 


3 


773 


SLEMSSDGEPLSRMDSEDSISSTIMDVDSTISSGRS 
TPAMMNGQGSTTSSSKNIAYNCCWDQCQACFNS 
SPDLADHIRSIHVDGQRGGVFVCLWKGCKVYNT 
PSTSQSWLQRHMLTHSGDKPFKCVVGGCNASFA 
SQGGLARHVPTHFSQQNSSKVSSQPKAKEESPSK 
AGMNKRRKLKNKRRRSLARPHDFFDAQTLDAIR 
HRAICFNLSAHIESLGKGHSVVFHSTVSILLFFQIK 
YKTLQKNISTIISKSLKI 


3712 


A 


2 


344 


RATWHNAGKEREAVQLMAGAEKRVKASHSFLR 
GLFGGNTRIEEACEMYTRAANMFKMAKNWSAA 
GNAFCQAAKLHMQLQSKHDSATSFVDAGNAYK 
KADPQGKTARHVACYLCV 


3713 


A 


20 


974 


GAAATACSSSSSSSGAPATWAAHGPGKDVASPS 

SVSLSPRRSRLLVLRCGLRRNPERPSSSPALRRLL 

LLLLLLLLLLLGFLLSPGPERGVGGGRFGRRLAL 

LWAAALGPIVVSGKVMSRRAPGSRLSSGGGGGG 

TNYSRSWNDWQPRTDSASADPGNLKYSSSRDRG 

GSSSYGLQPSNSAWSRQRHDDTRVHADIQNDE 

KGGYSVNGGSGENTYGRKSLGQELRVNNVTSPE 

FTSVQHGSRALATKDMRKSQERSMSYCDESRLS 

YLLRRITRENDRDRRLATVKQLKEFIQQPENKLV 

LVKQLDILAAVHDVLNER 


3714 


A 


237 


458 


IFALKSPSYLLPCCTPEGKMDHKQLCWSHPQKSG 
QSSRSCCICSNQHGLIWKYSLNMCLQCCHQYVK 
DIGFEKL 


3715 


A 


970 


1524 


LCTLSPGISGTAGSCLTTEPGTELGTSFAQNGFYH 

EAVVLFTQALKLNPQDHRLFGNRSFCHERLGQP 

AWALADAQVALTLRPGWPRGLFRLGKALMGLQ 

RFREAAAVFQETLRGGSQPDAARELRSCLLHLTL 

QGQRGGICAPPLSPGALQPLPHAELAPSGLPSLRC 

PRSTALRSPGLSPLLH 


3716 


A 


85 


308 


QGLPSTMVKLGCSFSGKPGKDPGDQDGAAMDS 

VPLISPLDISQLQPPLPDQWIKTQTEYQLSSPDQQ 

NYTKSR 


3717 


A 


58 


618 


GAGCTSPGLWARKAAARCLPTYPSRAQPSNVGR 

RRRRRPGLGALAAGVPAMAESVERLQQRVQELE 

RELAQERSLQVPRSGDGGGGRVRIEKMSSEWD 

SNPYSRLMALKRMGIVSDYEKIRTFAVAIVGVGG 

VGSVTAEMLTRCGIGKLLLFDYDKVELANMNRL 

FFQPHQAGLSKVQAAGHTPEE 


3718 


A 


3 


593 


RGAGGRAGGRADGQPNMADQRQRSLSTSGESL 

YHVLGLDKNATSDDIKKSYRKLALKYHPDKNPD 

NPEAADKFKEINNAIMILTDATKRNIYDKYGSLG 

LYVAEQFGEENVNTYFVLSSWWAKALFVFCGLL 

TCCYCCCCLCCCFNCCCGKCKPKAPEGEETEFY 

VSPEDLEAQLQSDEREATDTPIVIQPASATEP 


3719 


A 


2 


2173 


SGGVRMGSRADGPRTSGHVTGKMAVFPWHSRN 

RNYKAEFASCRLEAVPLEFGDYHPLKPITVTESK 

TKKVNRKGSTSSTSSSSSSSWDPLSSVLDGTDPL 

SMFAATADPAALAAAMDSSRRKRDRDDNSVVG 

SDFEPWTNKRGEILARYTTTEKLSINLFMGSEKG 

KAGTATLAMSEKVRTRLEELDDFEEGSQKELLN 

LTQQDYVNRIEELNQSLKDAWASDQKVKAPKN 

VHPGKLVYERIFSMCVDSRSVLPDHFSPENANDT 

AKETCLNWFFKIASIRELIPRFYVEASILKCNKFLS 

KTGISECLPRLTCMIRGIGDPL\GSVYARAYL\SRV 

GMEVAPHLKETLNKNFFDFLLTFKQIHGDTVQN 

QLVVQGVELPSYLPLYPPAMDWIFQCISYHAPEA 

LLTEMMERCKKLGNNALLLNSVMSAFRAEFIAT 

RSMDFIGMIKECDESGFPKHLLFRSLGLNLALAD 

PPESDRLQILNEAWKVITKLKNPQDYINCAEVWV 

EYTCKHFTKREVNTVLADVIKHMTPDRAFEDSY 

PQLQLIIKKVIAHFHDFSVLFSVEKFLPFLDMFQK 

ESVRVEVCKC1\RTPLSSINKSPPRTRSS*MPFCMF 

ARPCMTL/CNALTLEDEKRMLSYLINGnKMVSF 

GRDFEQQLSFYVESRSMFCNLEPVLVQLIHSVNR 

LAMETRKVMKGNHSRKTAAFVRSWGAYWFITIP 

SLAGIFTRLNLYLHSG 


3720 


A 


24 


296 


ENLFRAGFAFSLLRSSFYISKTYCSWFSNLISGSL 

ADFNSKGTRDYSPRQMAVRE/KVFDVIIRCFKRH 

GAEVIDTPVFELKVRNGQEETTW 


3721 


A 


2 


310 


PSCLTCVGHCSIGGSCTMIGIMMPECHCSLHMTG 
PRCEEHVFILQQPGHIASILIPLLVLLLLALVAGVV 
FWHKRRVQGAKGFQHQRMTNGAMNVEIGNPTY 

K 


3722 


A 


75 


722 


MELVAGCYEQVLFGFAVHPEPEACGDHEQWTL 

VADFTHHAHTASLSAVAVNSRFVVTGSKDETIHI 

YDMKKKIEHGALVHHSGTITCLKFYGNRHLISGA 

EDGLICIWDAKKWECLKSIKAHKGQVTFLSIHPS 

GKLALSVGTDKTLRTWNLVEGRSAFIKNIKQNA 

HIVEWSPRGEQYVVnQNKIDIYQLDTASISGTITN 

EKRISSVKFLSES 


3723 


A 


110 


316 


MELSDNRRSGGLEGLAEKCPNLTYLNLSGNKIK 
DLSTVEALVSGTVLSLDLLFLVKFSEICLCLLISI 


3724 


A 


3 


406 


VDRGTEAWQRDPAFSGLQRVGGVDVSFVKGDS 




VRACASLGVLSFPELEWYEESRMVSLTAPYVSG 
FLAFREVPFLLELVQQLREKEPGLMPQVLLVDGN 
GVLHHRGFGVACHLGVLTDLPCVGVAKKLLQV 
DG 


3725 


A 


3 


406 


VDRGTEAWQRDPAFSGLQRVGGVDVSFVKGDS 

VRACASLGVLSFPELEWYEESRMVSLTAPYVSG 

FLAFREVPFLLELVQQLREKEPGLMPQVLLVDGN 

GVLHHRGFGVACHLGVLTDLPCVGVAKKLLQV 

DG 


3726 


A 


1 


433 


SSDDRSLFRRLKLNYAIFDEGHMLKNMGSIRYQ 
HLMTINANNRLLLTGTPVQIWLLELMSLLNFVM 

PHMFSSSTSEIRRMFSSKTKSADEQSIYEKERIAH 
AKQIIKPFILRRVKEEVLKQLPPKKDRIELCAMSE 

KQEQLYLG 


3727 


A 


6 


383 


RJPRGKACXTVLGRSTGELEGFASSRLPPQPCGW 
GQSSDLLSRIDLDELMKKDEPPLDFPDTLEGFEY 
AFNEKGQLRHIKTGEPFVFNYREHLHRWNQKRY 
EALGEIITKYVYELLEKDCNSICKVS 


3728 


A 


3 


2452 


EIAGAAAENMLGSLLCLPGSGSVLLDPCTGSTISE 

TTSEAWSVEVLPSDSEAPDLKQEERLQELESCSG 

LGSTSDDTDVREVSSRPSTPGLSWSGISATSEDIP 

NKIEDLRSECSSDFGGKDSVTSPDMDEITHDFLYI 

LQPKQHFQHIEAEADMRIQLSSSAHQLTSPPSQSE 

SLLAMFDPLSSHEGASAVVRPKVHYARPSHPPPD 

PPILEGAVGGNEARLPNFGSPMF*LPAEMEAFKQ 

RHSA^TPERLVRSRSSNDIVSSVRRPMSDPSWNRR 

P\GNEERELPPAAAIGATSLVAAPHSSSSSPSKDSS 

RGETEERKDSDDEKSDRNRPWWRKRFVSAMPK 

APIPFRKKEKQEKDKDDLGPDRFSTLTDDPSPRLS 

AQAQVAEDILDKYRNAKRTSPSDGAMANYEST 

EVMGDGESAHDSPRDEALQNISADDLPDSASQA 

AHPQDSAFSYRDAKKKLRLALCSADSVAFPVLTV 

HSTKNGLPDHTDPEDNEIVCFLKVQIAEAINLQD 

KNLMAQLQETMRCVCRFDNRTCRKLLASIAEDY 

RKRAPYIAYLTRCRQGLQTTQAHLERLLQRVLR 

DKEVANRYFTTVCVRLLLESKEKKIREFIQDFQK 

LTAADDKTAQVEDFLQFLYGAMAQDVIWQNAS 

EEQLQDAQLAIERSVMNRIFKLAFYPNQDGDILR 

DQVLHEHIQRLSKVVTANHRALQIPEVYLREAP 

WPSAQSEIRTISAYKTPRDKVQCILRMCSTIMNLL 

SLANEDSVPGADDFVPVLVFVLIKANPPCLLSTV 

QYISSFYASCLSGEESYWWMQFTAAVEFIKTIDD 

RK 


3729 


A 


3 


2452 


EIAGAAAENMLGSLLCLPGSGSVLLDPCTGSTISE 

TTSEAWSVEVLPSDSEAPDLKQEERLQELESCSG 

LGSTSDDTDVREVSSRPSTPGLSWSGISATSEDIP 

NKIEDLRSECSSDFGGKDSVTSPDMDEITHDFLYI 

LQPKQHFQHIEAEADMRIQLSSSAHQLTSPPSQSE 

SLLAMFDPLSSHEGASAVVRPKVHYARPSHPPPD 

PPILEGAVGGNEARLPNFGSPMF*LPAEMEAFKQ 

RHSA'TPERLVRSRSSVDIVSSVRRPMSDPSWNRR 

P\GNEERELPPAAAIGATSLVAAPHSSSSSPSKDSS 

RGETEERKDSDDEKSDRNRPWWRKRFVSAMPK 

APIPFRKKEKQEKDKDDLGPDRFSTLTDDPSPRLS 

AQAQVAEDILDKYRNAIKRTSPSDGAMANYEST 

EVMGDGESAHDSPRDEALQNISADDLPDSASQA 

AHPQDSAFSYRDAKKKLRLALCSADSVAFPVLT\ 

HSTRNGLPDHTDPEDNEIVCFLKVQIAEAINLQD 

KNLMAQLQETMRCVCRFDNRTCRKLLASIAEDY 

RKRAPYIAYLTRCRQGLQTTQAHLERLLQRVLR 

DKEVANRYFTTVCVRLLLESKEKKIREFIQDFQK 

LTAADDKTAQVEDFLQFLYGAMAQDVIWQNAS 

EEQLQDAQLAIERSVMNRIFKLAFYPNQDGDILR 

DQVLHEHIQRLSKVVTANHRALQIPEVYLREAP 

WPSAQSEIRTISAYKTPRDKVQCILRMCSTIMNLL 

SLANEDSVPGADDFVPVLVFVLDCANPPCLLSTV 

QYISSFYASCLSGEESYWWMQFTAAVEFIKTIDD 

RK 


3730 


A 


3 


2452 


EIAGAAAENMLGSLLCLPGSGSVLLDPCTGSTISE 

TTSEAWSVEVLPSDSEAPDLKQEERLQELESCSG 

LGSTSDDTDVREVSSRPSTPGLSWSGISATSEDIP 

NKIEDLRSECSSDFGGKDSVTSPDMDEITHDFLYI 

LQPKQHFQHIEAEADMRIQLSSSAHQLTSPPSQSE 

SLLAMFDPLSSHEGASAVVRPKVHYARPSHPPPD 

PPILEGAVGGNEARLPNFGSPMF*LPAEMEAFKQ 

RHSA'TPERLVRSRSSVDIVSSVRRPMSDPSWNRR

P\GNEERELPPAAAIGATSLVAAPHSSSSSPSKDSS 

RGETEERKDSDDEKSDRNRPWWRKRFVSAMPK 

APIPFRKKEKQEKDKDDLGPDRFSTLTDDPSPRLS 

AQAQVAEDILDKYRNAIKRTSPSDGAMANYEST 

EVMGDGESAHDSPRDEALQNISADDLPDSASQA 

AHPQDSAFSYRDAKKKLRLALCSADSVAFPVLT\ 

HSTRNGLPDHTDPEDNEIVCFLKVQIAEAINLQD 

KNLMAQLQETMRCVCRFDNRTCRKLLASIAEDY 

RKRAPYIAYLTRCRQGLQTTQAHLERLLQRVLR 

DKEVANRYFTTVCVRLLLESKEKKIREFIQDFQK 

LTAADDKTAQVEDFLQFLYGAMAQDVIWQNAS 

EEQLQDAQLAIERSVMNRIFKLAFYPNQDGDILR

DQVLHEHIQRLSKWTANHRALQIPEVYLREAP 

WPSAQSEIRTISAYKTPRDKVQCILRMCSTIMNLL 

SLANEDSVPGADDFVPVLVFVLIKANPPCLLSTV

QYISSFYASCLSGEESYWWMQFTAAVEFIKTIDD 

RK 


3731 


A 


1 


1305 


VNTAMHEAKLIVIEECDELVEIIQQRKQMIAVKIK 

ETKVMKLRKLAQQVANCRQCLERSTVLINQAEH 

ILKENDQARFLQSAKNIAERVAMATASSQVLIPDI 

NFNDAFENFALDFSREKKLLEGLDYLTAPNPPSIR 

EELCTASHDTITVHWISDDEFSISSYELQYTIFTGQ 

ANFISLYNSVDSWMIVPNIKQNHYTVHGLQSGTR 

YIFIVKAINQAGSRNSEPTRLKTNSQPFKLDPKMT 

HKKLKISNDGLQMEKDESSLKKSHTPERFSGTGC 

YVYGVLHNSDNS*MFISLSFPLSHRYAIGIAYKSA 

PKNEWIGKNASSWVFSRCNSNFWRHNNKEML 

VDVPPHLKRLGVLLDYDNY/NMLSFYDPANSL\H 

LHTFDVTFULPVCPTFTIWNKSLMILSGLPAPDFI 

DYPERQECNCRPQESPYVSGMKTCH 


3732 


A 


127 


2832 


LGQRLSLVPRPSLKRRLGKRLSLGLRERMMSLW 
WS/GPKVRTQATTGARPKTETKSVPAARPKTEAQ 
AMSGARPKTEVQVMGGARPKTEAQGITGARPKT 
DARAVGGARSKTDAKAIPGARPKDEAQAWAQS 

EFGTEAVSQAEGVSQTNAVAWPLATAESGSVTK 

SK\ACLWIEN*SMWM/PETFPGTQGQKGIQPWFG 

PGEETNMGSWCYSRPRAREEASNESGFWSADET 

STASSFWTGEETSVRSWPREESNTRSRHRAKHQT 

NPRSRPRSKQEAYVDSWSGSEDEASNPFSFWVG 

ENTNNLFRPRVREEANIRSKLRTNREDCFESESED 

EFYKQSWVLPGEEANMDSGTETKKILILPWKLRA 

QKDVDSDRVKQEPRFEEEVIIGSWFWAEKEASLE 

GGASAICESEPGTEEGAIGGSAYWAEEKSSLGAV 

AREEAKPESEEEAIFGSWFWDRDEACFDLNPCPV 

YKVSDRFRDAAEELNASSRPQTWDEVTVEFKPG 

LFHGVGFRSTSPFGIPEEASEMLEAKPKmELSPE 

GEEQESLLQPDQPSPEFTFQYDPSYRSVREIREHL 

RARESAESESWSCSCIQCELKIGSEEFEEFLLLMD 

KIRDPFIHEISKIAMGMRSASQFTRDFIRDSGVVS 

LIETLLNYPSSRVRTSFLENMIHMAPPYPNLNMIE 

TFICQVCEETLAHSVDSLEQLTGNKGCFRHLTMT 

IDYHT\LIAN»YGPGFPLLF*PQAQCGETKFHVLK 

MLLNLSENPAVAKKLFSAKALSIFVGLFNIEETN 

DNIQIVIKMFQNISNIIKSGKMSLIDDDFSLEPLISA 

FREFEELAKQLQAQIDNQNDPEATGTTAFVGKG 

NNPSANRERLSPSVFCPGAQEAESLPARRVRGEE 

QRLLLEEVGARTADGIPEGW 


3733 


A 


2 


3274


DVPLIRIEEDTGEIFTTGARIDREKLCAGIPRDEHC 

FYEVEVAILPDEIFRLVKIRJFLIEDINDNAPLFPAT 

VINISIPENSAINSKYTLPAAVDPDVGINGVQNYE 

LIKSQNIFGLDVIETPGGDKMPQLIVQKELDREEK 

DTYVMKVKVEDGGFPQRSSTAILQVSVTDTNDN 

HPVFKETEffiVSIPENAPVGTSVTQLHATDADIGE 

NAKIHFSFSNLVSNIARRLFHLNATTGLITIKEPLD 

REETPNHKLLVLASDGGLMPARAMVLVNVTDV 

NDNVPSroiRYIVNPVNDTVVLSENIPLNTKIALIT 

VTDKDADHNGRVTCFTDHEIPFRLRPVFSNQFLL 

ETAAYLDYESTKEYAIKLLA\ADAGKPPLNQSAM 

LFIKVKDENDNAPVFTQSFVTVSIPENNSPGIQLT 

KVSAMDADSGPNAKINYLLGPDAPPEFSLDCRT 

GMLTVVKKLDREKEDKYLFTILAKDNGVPPLTS 

NVTVFVSIIDQNDNSPVFTHNEYNFYVPENLPRH 

GTVGLITVTDPDYGDNSAVTLSILDENDDFTIDSQ 

TGVIRPNISFDREKQESYTFYVKAEDGGRVSRSSS 

AKVTINVVDVNDNKPVFIVPPSNCSYELVLPSTN 

PGTVVFQVIAVDNDTGMNAEVRYSIVGGNTRDL 

FAIDQETGNITLMEKCDVTDLGLHRVLVKANDL 

GQPDSLFSVVIVNLFVNESVTNATLINELVPQKH 

LKHQ*PQILEIADVSSPTSDYVKILVAAVAGTITV 

WVIFITAVVRCRQAPHLKAAQKNMQNSEWATP 

NPENRQMIMMKKKKKKKKHSPKNLLLNVVTffiE 

TKADDVDSDGNRVTLDLPIDLEEQTMGKYNWV 

TTPTTFKPDSPDLARHYKSASPQPAFQIQPETPLN 

LKHHnQELPLDNTFVACDSISNCSSSSSDPYSVSD 

CGYPVTTFEVPVSVHTRPPVDLEVGGAQSGQVAI 

LTSSLMELLLCLMVAAFLPLELRPLGQQNVMSW 

EQEAKBLLVGYWGDGEWCHFHFHHLIPGPVNPG 

YERKQYHILDSDSEDTQPSGELCPIPVRPFTILSIQ 

LLQDDGEHCGTKQGFQPAVQLGLLPHKTLK 


3734 


A 


1 


840 


GTRPGHLPAPSDGFCV/HL*SIPSWGSF*GESL/EM 

QLITSLGLQEFDIARNVLELIYAQTLVWIGIFFCPL 

LPFIQMIMLFIMFYSKMSLMMNFQPPSKAWRAS 

QMMTFFIFLLFFPSFTGVLCTLAITIWRLKPSADC 

GPFRGLPLFfflSIYSWIDTLSTRPGYLWVVWIYKN 

LIGSVHFFFILTLIVLIITYLYWQITEGRKIMB^ 

EQHhffiGKDKMFLffiKLIKLQDMEKKANPSSLVLE 

RREVEQQGFLHLGEHDGSLDLRSRRSVQEGNPR 

A 


3735 


A 


2 


432 


VEVCRRYLWKMTVDASQNVQCCVIFSHFPFIFN 

NLSKIKLLHTDTLLKIESKKHKAYLRSAAIEEERE 

SEFALRPTFDLTVRRNHLIEDVLNQLSQFENEDL 

RKELWVSFSGEIGYDLGGSAraGEIFYCLFAEMIQ 

PEYGMFMY 


3736 


A 


1542 


343 


KGAPSFVRLYQYPNFAGPHAALANKSFFKADKV 

mLWNKKATAVLVIASTDVDKTGASYYGEQTL 

HYIATNGESAVVQLPKNGPIYDWWNSSSTEFCA 

VYGFMPAKATIFNLKCDPVFDFGTGPRNAAYYS 

PHGHILVLAGFGNLILQI*AD/IMKVWNVKNYKLI 

SKPVASDSTYFAWCPDGEHILTATCAPRLRVNN 

GYKIWHYTGSILHKYDVPSNAELWQVSWQPFLD 

GIFPAKTITYQAVPSEVPNEEPKVATAYRPPALRN 

KPITNSKLHEEEPPQNMKPQSGNDKPLSKTALKN 

QRKHEAKKAAKQEARSDKSPDLAPTPAPQSTPR 

NTVSQSISGDPEIDKKIKNLKKKLKAIEQLKEQAA 

TGKQLEKNQLEKIQKETALLQELEDLELGI 


3737 


A 


3190 


664 


VAMGTPRAQHPPPPQLLFLILLSCPWIQGLPLKEE

EILPEPGSETPTVASEALAELLHGALLRRGPEMG 

YLPGPPLGPEGGEEETTTTIITTTTVTTTVTSPVLC 

NNNISEGEGYVESPDLGSPVSRTLGLLDCTYSIHV 

YPGYGIEIQVQTLNLSQEEELLVLAGGGSPGLAP 

RLLANSSMLGEGQVLRSPTNRLLLHFQSPRVPRG 

GGFRIHYQAYLLSCGFPPRPAHGDVSVTDLHPGG 

TATFHCDSGYQLQGEETLICLNGTRPSWNGETPS 

CMASCGGTIHNATLGRIVSPEPGGAVGPNLTCR 

WVIEAAEGRRLHLHFERVSLDEDNDRLMVRSGG 

SPLSPVIYDSDMDDVPERGLISDAQSLYVELLSET 

PANPLLLSLRFEAFEEDRCFAPFLAHGNVTTTDPE 

YRPGALATFSCLPGYALEPPGPPNAIECVDPTEPH 

WNDTEPACKAMCGGELSEPAGWLSPDWPQSY 

SPGQDCVWGVHVQEEKRILLQVEILNVREGDML 

TLFDGDGPSARVLAQLRGPQPRRRLLSSGPDLTL 

QFQAPPGPPNPGLGQGFVLHFKEVPRNDTCPELP 

PPEWGWRTASHGDLIRGTVLTYQCEPGYELLGS 

DILTCQWDLSWSAAPPACQKIMTCADPGEIANG 

HRTASDAGFPVGSHVQYRCLPGYSLEGAAMLTC 

YSRDTGTPKWSDRVPKCALKYEPCLNPGVPENG 

YQTLYKHHYQAGESLRFFCYEGFELIGEVTITCV 

PGHPSQWTSQPPLCKVTQTTDPSRQLEGGNLAL 

AILLPLGLVIVLGSGVYIYYTKLQGKSLFGFSGSH 

SYSPITVESDFSNPLYEAGDTREYEVSI 


3738 


A 


3190 


664 


VAMGTPRAQHPPPPQLLFLILLSCPWIQGLPLKEE 
EILPEPGSETPTVASEALAELLHGALLRRGPEMG 
YLPGPPLGPEGGEEETTTTIITTTTVTTTVTSPVLC
NNNISEGEGYVESPDLGSPVSRTLGLLDCTYSIHV 



WO 01/57190 PCT/US01/04098




YPGYGIEIQVQTLNLSQEEELLVLAGGGSPGLAP 

RLLANSSMLGEGQVLRSPTNRLLLHFQSPRVPRG 

GGFRIHYQAYLLSCGFPPRPAHGDVSVTDLHPGG 

TATFHCDSGYQLQGEETLICLNGTRPSWNGETPS 

CMASCGGTIHNATLGRIVSPEPGGAVGPNLTCR 

WVIEAAEGRRLHLHFERVSLDEDNDRLMVRSGG 

SPLSPVIYDSDMDDVPERGLISDAQSLYVELLSET 

PANPLLLSLRFEAFEEDRCFAPFLAHGNVTTTDPE 

YRPGALATFSCLPGYALEPPGPPNAIECVDPTEPH 

WNDTEPACKAMCGGELSEPAGVVLSPDWPQSY 

SPGQDCVWGVHVQEEKRILLQVEILNVREGDML 

TLFDGDGPSARVLAQLRGPQPRRRLLSSGPDLTL 

QFQAPPGPPNPGLGQGFVLHFKEVPRNDTCPELP 

PPEWGWRTASHGDLIRGTVLTYQCEPGYELLGS 

DILTCQWDLSWSAAPPACQKIMTCADPGEIANG

HRTASDAGFPVGSHVQYRCLPGYSLEGAAMLTC 

YSRDTGTPKWSDRVPKCALKYEPCLNPGVPENG 

YQTLYKHHYQAGESLRFFCYEGFELIGEVTITCV 

PGHPSQWTSQPPLCKVTQTTDPSRQLEGGNLAL 

AILLPLGLVIVLGSGVYIYYTKLQGKSLFGFSGSH 

SYSPITVESDFSNPLYEAGDTREYEVSI 


3739 


A 


734 


445 


LLEPEPAEEYTEQSEVEST/EGMILI*CCLYFAAFQ 
Thr/SNIYFALQYV]SnR.QFMAETQFTSGEKEQVDE 
WTVETVEVRVLCIAKLLSLSSVSNFYLY 


3740 


A 


2 


1578 


MAHYITFLCMVLVLLLQNSVLAEDGEVRSSCRT 

APTDLVFILDGSYSVGPENFEIVKKWLVNITKNF 

DIGPKFIQVGVVQYSDYPVLEIPLGSYDSGEHLTA

AVESILYLGGNTKTGKAIQFALDYLFAKSSRFLT 

KIAWLTDGKSQDDVKDAAQAARDSKITLFAIG 

VGSETEDAELRAIANKPSSTYVFYVEDYIAISKIR 

EVMKQKLCEESVCPTRIPVAARDERGFDILLGLD 

VNKKVKKRIQLSPKKIKGYEVTSKVDLSELTSNV 

FPEGLPPSYVFVSTQRFKVKKIWDLWRILTIDG/* 

PQIAVTLNGVDKDLLFTTTSVINGSQVVTFANPQV 

KTLFDEGWHQIKLLVTEQDVTLYIDDQQIENKPL 

HPVLGILINGQTQIGKYSGKEETVQFDVQKLRIY 

CDPEQNNRETACEIPGFCLNGPSDVGSTPAPCICP 

PGKPGLQGPKGDPGLPGNPGYPGQPGQDGKPVS 

TESLVISGISGITGYQGIAGTPGVPGSPGIQGARGL

PGYKGEPGRDGDK 


3741 


A 


5048 


1236 


MSAPAGSSHPAASARIPPKFGGSAVSGAAAPAGP 

GAGPAPHQQNGPAQNQMQVPSGYGLHHQNYIA 

PSGHYSQGPGKMTSLPLDTQCGDYYSALYTVPT 

QNVTPNTVNQQPGAQQLYSRGPPAPHIVGSTLGS 

FQGAASSASHLHTSASQPYSSFVNHYNSPAMYS 

ASSSVASQGFPSTCGHYAMSTVSNAAYPSVSYPS 

LPAGDTYGQMFTSQNAPTVRPVKDNSFSGQNTA 

ISHPSPLPPLPSQQHHQQQSLSGYSTLTWSSPGLP 

STQDl^IRNHTGSLAVANNNPTITVADSLSCPVM 

QNVQPPKSSPWSTVLSGSSGSSSTRTPPTANHPV 

EPVTSVTQPSELLQQKGVQYGEYVNNQASSAPT 

PLSSTSDDEEEEEEDEEAGVDSSSTTSSASPMPNS 

YDALEGGSYPDMLSSSASSPAPDPAPEPDPASAP 

APASAPAPVVPQPSKMAKPLAMAIQHFSLVIRML 

QHHLFLEYSPSNPVYSGFQQYPQQYPGVNQLSSS 

IGGLSLQSSPQPESLRPVNLTQERNILPMTPVWAP 
VPNLNADLKKLNCSPDSFRCTLTNIPQTQALLNK 
AKLPLGLLLHPFRDLTQLPVITSNTIVRCRSCRTYI 
NP\FVSFIDQRR*KCNLCYR\^VPEEFMYNPLT 
RSYGEPHKRPEVQNS\TVEF1ASSDYMLRPPQPAV 
YLFVLDVSHNAVEAGYLTI/LWCQSLLEVNLDKLP 
G\DSRT\RIGFMTFD\STYSFLQFTQEGLSQPQMLI 
VSDIDDVFLPTPDSLLVNLYESKELIKDLLNALPN 
MFTNTRETHSALGPALQAAFKLMSPTGGRVSVF 
QTQLPSLGAGLLQSREDPNQRSSTKVVQHLGPAT 
DFYKKLALDCSGQQTAVDLFLLSSQYSDLASLA 
CMSKYSAGCIYYYPSFHYTHNPSQAEKLQKDLK 
RYLTRKIGFEAVMRIRCTKGLSMHTFHGNFFVRS 
TDLLSLANINPDAGFAVQLSIEESLTDTSLVCFQT
ALLYTSSKGERRJRVHTLCLPWSSLSDVYAGVD 
VQAAICLLANMAVDRSVSSSLSDARDALVNAW 
DSLSAYGSTVSNLQHSALMAPSSLKLFPLYVLAL 
LKQKAFRTGTSTRLDDRVYAMCQIKSQPLVHLM 
KNflHPNLYRIDRLTDEGAVHVNDRTVPQPPLQKL 
SAEKLTREGAFLMDCGSVFYIWVGKGCDNNFIE 
DVLGYTNFASIPQKMTHLPELDTLSSERARSFIT 
WLRDSRPLSPILfflVKDESPAKAEFFQHLIEDRTE 
AAFSYYEFLLHVQQQICK 


3742 


A 


934 


68 


SMLASQGVLLHPYGVPMIVPAAPYLPGLIQGNQE 

AAAAPDTMAQPYASAQFAPPQNGIPAEYTAPHP 

HPAPEYTGQTTVPEHTLNLYPPAQTHSEQSPADT 

SAQTVSGTRNKQD*RSTDGWPSPKTQTS*KHGK 

QVSSPSGLHVSNIPFR\FRDPDLRQMF\GQFGKILD 

VEIIFNERGSKGFGFVTFENSADADRAREK\LHGT 

VV\EGRKI\EVN\NATARVMTNKKTVNPYTNGWK 

LNPVVGAVYSPEFYAGTVLLCQANQEGSSMYSA 

PSTDFRGAKLHTSRPLLSGS 


3743 


A 


3 


1456 


QFQQAWMQNKVPIPAPNEVLInFDRKEDIKLEEKK 

KTQAEIEQEMATLQYTNPQLLEQLKIERLAQKQV 

EQIQPPPSSGTPLLGPQPFPGQGPMSQIPQGF/PTA 

PSISADANEHGS\KGPPGPQGQFRPPGPQGQMGP 

QGPPLHQGGGGPQGFMGPQGPQGPPQGLPRPQD 

MHGPQGMQRHPGPHGPLGPQGPPGPQGSSGPQG 

HMGPQGPPGPQGHIGPQGPPGPQGHLGPQGPPGT 

QGMQGPPGPRGMQGPPHPHGIQGGPGSQGIQGP 

VSQGPLMGLNPKGMQGPPGPRENQGPAPQGMI 

MGHPPQEMRGPHPPGGLLGHGPQEMRGPQEIRG 

MQGPPPQGSMLGPPQELRGPPGSQSQQGPPQGSL 

GPPPQGGMQGPPGPQGQQNPARGPHPSQGPIPFQ 

QQKTPLLGDGPRAPFNQEGQSTGPPPLIPGLGQQ 

GAQGRIPPLNPGQGPGPNKVS/ERGAPPRHEGRA 

PPRGRDGFPGPMKTLV 


3744 


A 


1571 


652 


PLTGRKCPGWTHSGSRRSPRIAEEVPGFPKRAEA 

SRQFSETADIU.ELLRRAV]VL\AARATTPADGEEP 

APEAEALAAARERSSRFLSGLELVKQGAEARVFR 

GRFQGIL^VIKHRFPKGYRHPALEARLGRRRTV 

QEARALLRCRRAGISAPVVFFVDYASNCLYMEEI 

EGSVT\nm\IFSPLWRLKKTPQGLSNLAKTIGQVL 

ARMHDEDLfflGDLTTSNMLLKPPLEQLNIVLIDF 

GLSFISALPEDKGVDLYVLEKAFLSTHPNTETVFE 

AFLKSYSTSSKKARPVLKKLDEVRLRGKKRSMV 
G 


3745 


A 


127 


1433 


GSHRFSLASPLDPEVGPYCDTPTMRTLFNLLWLA 

LACSPVHTTLSKSDAKKAASKTLLEKSQFSDKPV 

QDRGLVVTDLKAESVVLEHRSYCSAKARDRHFA 

GDVLGYVTPWNSHGYDVTKVFGSKFTQISPVWL 

QLKRRGREMFEVTGLHDVDQGWMRAVRKHAK 

GL\P*CLGSCLRTGLTMISG/YVLDSEDEffiELSKT 

VVQVAKNQHFDGFWEVWNQLLSQKRVGLIHM 

LTHLAEALHQARLLALLVIPPAITPG'IDQLGMFT 

HKEFEQLAPVLDGFSLMTYDYSTAHQPGPNAPL 

SWVRACVQVLDPKSKWRSKELLGLNFYGMDYA 

TSKDAREPVVGARYIQTLKDHRPRMVWDSQVSE 

HFFEYKKSRSGRHVVFYPTLiCSLQVRLELARELG 

VGVSIWELGQGLDYFYDLL*VGIAASAVDVFFSK 

PWSE 


3746 


A 


1 


898 


IDRAAECRTKPLPMAVSIRGNADSr/ACLVLMVL 

YLIKKRLVACAAVFYGFAVHMKTYPETYILPITL 

HLLPDRDNDKSLRQFRYTFQACL*ELLKRLCmT 

ALMFVAVAGLTFFALSFGFYYEYGWEFLEHTYF 

YHLTRRDIRHNFSPYFYMLYLTAESKWSFSLGIA 

AFLPQLILLSAVSFAYYRDLVFCWFLHTSIFVTFN 

KVCTSQYFLWYLCLIJPLVMPLVR^dPWKRAVVL 

LMLWFIGQAMWLAPAYVLEFQGKNTFLFIWLA 

GLFFLLINCSILIQIISHYKEEPLTERIKYD 


3747 


A 


1 


2325 


MVISFQGLVTFGDVAVDFSQEEWEWLNPIQRNL 

YRKVMLENYRNLASLGLCVSKPDVISSLEQGKEP 

WTVKRKMTRAWCPDLKAVWKIKELPLKKDFCE 

GKLSQAVITERLTSYNLEYSLLGEHWDYDALFET 

QPGLVTIKNLAVDFRQQLHPAQKNFCKNGIWEN 

NSDLGSAGHCVAKPDLVSLLEQEKEPWMVKREL 

TGSLFSGQRSVHETQELFPKQDSYAEGVTDRTSN 

TKLDCSSFRENWDSDYVFGRKLAVGQETQFRQE 

PITHNKTLSKERERTYNKSGRWFYLDDSEEKVH 

NRDSIKNFQKSSVVIKQTGIYAGKKLFKCNECKK 

TFTQSSSLTVHQRIHTGEKPYKCNECGKAFSDGS 

SFARHQRCHTGKKPYECIECGKAFIQNTSLIRHW 

RYYHTGEKPFDClDCGKAFSDfflGLNQHRRIHTG 

EKPYKCDVCHKSF\RYGSSLTVHQRIHTGEKPYE 

CDVCRKAFSHHASL'HQNHQRVHSGEKPFKCKEC 

GKAFRQNIHLASHLRIHTGEKPFECAECGKSFSIS 

SQLATHQRIHTGEKPYECKVCSKAFTQKAHLAQ 

HQKTHTGEKPYECKECGKAFSQTTHLIQHQRVH 

TGEKPYKCMECGKAFGDNSSCTQHQRLHTGQRP 

YECIECGKAFKTKSSLICHRRSHTGEKPYECSVC 

GKAFSHRQSLSVHQRIHSGKKPYECKECRKTFIQI 

GHLNQHKRVHTGERSYNYKKSRKVFRQTAHLA 

HHQRMTGESSTCPSLPSTSNPVDLFPKFLWNPSS 

LPSP 


3748 


A 


823 


1 


GGYTKSGYDSACKDFVPHDLEVQIPGRVFLVTG 

GNSGIGKATALEIAKRGGTVHLVCRDQAPAEDA 

RGEIIRE\SGNQNIFLHIVDLSDPKKIWKFVENFKQ 

EHKLHVL\V^WAGC^^VNKREAHKKMDFEKNFG 

CQYSGVCTFLTTRPDPLCWRKNTDPRVmvSSG 

GMLVQKLNNQ*SPVRKNTIWMGTMVYAQNKVS 

ERQQWL'I\ERWGPRAPG\IHFSSMHPGWA\DTPG 
VRQAMPGFHVQASGYRLRSEAQGADTMLWLAL 
SSARSRTAQRP 


3749 


A 


1939 


715 


GFLRLSQAT\RQRLSIPVMVLTLDPTRD\QCFGDR 

FSRLLLDEFLGYDDILVMSSVKGLAENEENKGFLR 

NWSGBHYRFVXSMWMAR-nSYLAAFANHGQSF 

TLSVSHACCGYSHHQIFVFIVDLLQMLEMNMAIA 

FPAAPLLTVILALVGMEAIMSEFFNDTTTAFVIILi 

VWLADQYDAICCHTSTSKRHWLRFFYLYHFAFY 

AYHYRFNGQYSSLALVTSWLFIQHSMIYFFHHYE 

LPAILQHVRIQ\EMLLQAPTLGPGTPTA\LPDDMN 

NNSGAPATAP\DSAGQPPALGPVSPGASGSPGPV 

AAAPSSLVAAAASVAAAAGGDLGWMAETAAIIT 

DASFLSGLSASLLERRPASPLGPAGGLPHAPQDS 

VPPSDSAASDTTPLGAAVGGPSPASMAPTEAPSE 

VGS 


3750 


A 


2 


844 


GLLEPFSKLLSFVIQNAVFTLAYLVELCGLCYRA 

FTKERDKFYLSRSWLELLQALKLKSPLPDTNLL 

LLVQFICADAGTKLAESTILSKQMIASVPGCGTA 

AMECVRQYINEVLDFMVADMHTLTKLKSHMKTC 

SQPLHEDTFGGHLKVGLAQIAAMDISRGNHRDN 

KAVIRYLPWLYHPPSAMQQGPKEFIECVSHIRLL 

SWLLLGSLTHNAVC/LKWPPLPGLPIPLDAGSHV 

ADHLrVILIGFPEQSKTSVL\HMCSLFHAF\SLAQL 

WDSLLARQSGRW 


3751 


A


431 


2 


AFTRKCEETAFIVPQCEIIPTE/WVCRRIPTGSSLER 

NPGVKEGCEFCPPKVEMFFKDDANHDPQWSRQ 

QLIAAKFGFAALGI/QTEVDIMSHAT*AVFEIPEKS 

RL\PQNCTPVDMKIEFGVHVTSKEILTDVIDNDS* 

RHSPS 


3752 


A 


131 


1278 


AWSGSGLLVLCINTASMPMISVLGKMFLWQREG 

PGGRWTCQTSRRVSSDPAWAVEWIELPRGLSLSS 

LGSARTLRGWSRSSRPSSVDSQDLPEVNVGDTV 

AMLPKSRRALTIQEIAALARSSLHGISQWKDHV 

TKPTAMAQGRVAHLIEWKGWSKPSDSPAALESA 

FSSYSDLSEGEQEARFAAGVAEQFAIAEAKLRA 

WSSVDGEDSTDDSYDEDFAGGMDTDMAGQLPL 

GPHLQDLFTGHRFSRPVRQGSVEPESDCSQTVSP 

DTLCSSLCSLEDGLLGSPARLA\PSCWAMSCFSPN 

CPPAGKVPSAAW/APLEAQDSLYNSPLTESCLSP 

AEEEPAPCKDCQPLCPPLTGSWERQRQASDLASS 

GVVSLDEDEAEPEEQ 


3753 


A 


3 


1138 


YYSSVRQRVTCEEPRFRECAAALIEGSATEVYAG 

EWRADRRSGFGVSQRSNGLRYEGEWLGNRRHG 

YGRTTRPDGSREEGKYKKNRLVHGGRVRSLLPL 

ALRRGKVKEKVDRAVEGARRAVSAARQRQEIA 

AARAADALLKAVAASSVAEKAVEAARMAKLIA 

QDLQPMLEAPGRRPRQDSEGSDTEPLDEDSPGV 

YENGLTPSEGSPELPSSPASSRQPWRPPACRSPLP 

PGGDQGPFSSPKAWPEEWGGAGAQAEELAGYE 

AEDEAGMQGPGPRDGSPLLGGCSDSSGSLREEE 

GEDEEPLPPLRAPAGTEPEPIAMLVLRGSSSRGPD 

AGCLTEELGEPAATERPAQPGAANPLWGAVAL 

LDLSLAFLFSQLLT 


3754 


A 


2 


3338 


SSLLEKMTSSDKDFRFMATSDLMSELQKDSIQLD 

EDSERKVVKMLLRLLEDKNGEVQNLAVKWLGV 

PLGAFHASLLHCLLPQLSSPRLAVRKRAVGALGH 

LATACSTDLFVELADHLLDRLPGPRVPTSPTAIRT 

LIQCLGSVGRQAGHRLGAHLDRLVPLVEDFCNL 

DDDELRESCLQAFEAFLRKCPKEMGPHVPNVTS 

LCLQYIKHDPNYNYDSDEDEEQMETEDSEFSEQE 

SEDEYSDDDDMSWKVRRAAAKCIAALISSRPDL 

LPDFHCTLAPVLIRRFKEREENVKADVFTAYIVL 

LRQTRPPKGWLEAMEEPTQTGSNLHMLRGQVPL 

VVKALQRQLKDRSVRARQGCFSLLTELAGVLPG 

SLAEHMPVLVSGHFSLADRSSSSTIRMDALAFLQ 

GLLGTEPAEAFHPHLPILLPPVMACVADSFYKIA 

AEALVVLQELVRALWPLHRPRMLDPEPYVGEMS 

AVTLARLRATDLDQEVKERAISCMGHLVGHLGD 

RLGDDLEPTLLLLLDRLRNEITRLPAIKALTLVAV 

SPLQLDLQPILAEALHILASFLRKNQRALRLATLA 

ALDALAQSQGLSLPPSAVQAVLAELPALVNESD 

MHVAQLAVDFLATVTQAQPASLVEVSGPVLSEL 

LRLLRSPLLPAGVLAAAEGFLQALVGTRPPCVDY 

AKLISLLTAPVYEQAVDGGPGLHKQVFHSLARC 

VAALSAACPQVEAESTASRLVCDARSPHSSTGVK 

VLAFLSLAEVGQVAGPGHERELKAVLLEALGSPS 

EDVRAAASYALGRVGAGSLPDFLPFLLEQIEAEP 

RRQYLLLHSLKEALGAAQPDSLKPYAEDIWALL 

FQRCEGAEEGTRGVVAECIGKLVLVNPSFLLPRL 

RKQLAAGRPHTRSTVITAVKFLISDQPHPIDPLLK 

SFIAVHNKPSLVRDLLDDILPLLYQETKIRRDLIRE 

VEMGPFKHTVDDGLDVRKAAFECMYSLLESCLG 

QLDICEFLNHVEDGLKDHYDIRMLTFIMVARLAT 

LCPAPVLQRVDRLIEPLRATCTAKVKAGSVKQEF 

EKQDELKRSAMRAVAALLTIPEVGKSPIMADFSS 

QIRSNPELAALFESIQKDSTSAPSTDSMELS 


3755 


A 


2 


3338 


SSLLEKMTSSDKDFRFMATSDLMSELQKDSIQLD 

EDSERKWKMLLRLLEDKNGEVQNLAVKWLGV 

PLGAFHASLLHCLLPQLSSPRLAVRKRAVGALGH 

LATACSTDLFVELADHLLDRLPGPRVPTSPTAIRT 

LIQCLGSVGRQAGHRLGAHLDRLVPLVEDFCNL 

DDDELRESCLQAFEAFLRKCPKEMGPHVPNVTS 

LCLQYIKHDPNYNYDSDEDEEQMETEDSEFSEQE

SEDEYSDDDDMSWKVRRAAAKCIAALISSRPDL 

LPDFHCTLAPVLIRRFKEREENVKADVFTAYIVL 

LRQTRPPKGWLEAMEEPTQTGSNLHMLRGQVPL 

VVKALQRQLKDRSVRARQGCFSLLTELAGVLPG 

SLAEHMPVLVSGIIFSLADRSSSSTIRMDALAFLQ 

GLLGTEPAEAFHPHLPILLPPVMACVADSFYKIA 

AEALVVLQELVRALWPLHRPRMLDPEPYVGEMS 

AVTLARLRATDLDQEVKERAISCMGHLVGHLGD 

RLGDDLEPTLLLLLDRLRNEITRLPAIKALTLVAV 

SPLQLDLQPILAEALHILASFLRKNQRALRLATLA 

ALDALAQSQGLSLPPSAVQAVLAELPALVNESD 

MHVAQLAVDFLATVTQAQPASLVEVSGPVLSEL 

LRLLRSPLLPAGVLAAAEGFLQALVGTRPPCVDY 

AKLISLLTAPVYEQAVDGGPGLHKQVFHSLARC 

VAALSAACPQ\EAESTASRLVCDARSPHSSTGVK 

VLAFLSLAEVGQVAGPGHERELKAVLLEALGSPS 

EDVRAAASYALGRVGAGSLPDFLPFLLEQIEAEP 

RRQYLLLHSLKEALGAAQPDSLKPYAEDIWALL 

FQRCEGAEEGTRGWAECIGKLVLVNPSFLLPRL 

RKQLAAGRPHTRSTVITAVKFLISDQPHPIDPLLK 

SFIAVHNKPSLVRDLLDDILPLLYQETKIRRDLIRE 

VEMGPFKHTVDDGLDVRKAAFECMYSLLESCLG 

QLDICEFLNHVEDGLKDHYDIRMLTFIMVARLAT 

LCPAPVLQRVDRLIEPLRATCTAKVKAGSVKQEF

EKQDELKRSAMRAVAALLTIPEVGKSPIMADFSS 

QIRSNPELAALFESIQKDSTSAPSTDSMELS 


3756 


A 


112 


1361 


SLEEQQGRHPSFAPKCASQILGRIMITLITEQLQK 

QTLDELKCTRFSISLPLPDHADISNCGNSFQLVSE 

GASWRGLPHCSCAEFQ/DQPQLQLPSLRPEPAPQ 

TTVHRGNSPKEQPFSQVLRPEPPDPEKLPVPPAPPS 

KRHCRSLSVPVDLSRWQPVWRPAPSKLWTPIKH 

RGSGGGGGPQVPHQSPPKRVSSL/SVPPSSQCLFS 

MCPSSHTLQPSFLQPGPGPVDSSRPCAASPQSGSW 

ESDAESLSPCPPQRRFSLSPSLGPQASRFLPSARSS 

PASSPELPWRPRGLRNLPRSRSOPCDLDARKTGV 

KRRHEEDPRRLRPSLDFDKMNQKPYSGGLCLQE 

TAREGSSISPPWFMACSPPPLSASCSPTGGSSQVL 

SESEEEEEGAVRWGRQALSKRTLCQRDFGDLDL 

NLIEEN 


3757 


A 


413 


1 


PKPMLQQDFT/SLPDQGLDHIAE/NSYFDARSLCA 
AELVCKEWQQVTSE*MLWKKLIERMVHAYPLW 
KGLSEKVW/DQHLFKNRPTDGPPNSFHRSLYPKII 
QV1ETIESNWQCG*HTLQRIQCHSEKSKGVYCLQ 
YDDEK 


3758 


A 


2 


613 


FVSGSPWRMDGSTERLEARRPAGRLPWSSRQEM 

TRRPSLMAGRQHGWSAQQSATVANPVPGANPD 

LLPHFLGEPEDVYIVKNKPVLLVCKAVPATQIFF 

KCNGEWVRQVDHVIERSTDGSSGLPTMEVRINV 

SRQQVEKVFGLEEYWCQCVAWSSSGTTKSQKA 

YIRIAYLRKNFEQEPLAKEVSLEQGIVLPCRPPEGI 

PPAE 


3759 


A 


1 


561 


ADDTLHLWNLRQKRPAILHSLKFCRERVTFCHLP 

FQSKWLYVGTERGNIHIVNVESFTLSGYVIMWN 

KAIELSSKSHPGPWfflSDNPMDEGKLLIGFESGT 

VVLWDLKSKKADYRYTYDEAIHSVAWHHEGKQ 

nCSHSDGTLTIWNVRSPAKPVQTITPHGKQLKD 

GKKPEPCKPILKVEFXTTR 


3760 


A 


1 


824 


LPACRCGCVAGCPSNHGICRCLRASERQVCVMH 

UCHLRTLLSPQDGAAKVTCMAWSQNNAKFAVC 

TVDRWLLYDEHGERRDKFSTKPADMKYGRXS 

YMVKGMAFSPDSTKIAIGQTDNHYVYKIGEDWG 

DKKVICNKFIQTVKFRPVPGTLG*TNIYQYIYL*IQ

PGVAFLTSECDFSYCKDGASWLFMVICCLP*SPA 

VSFPIGD*\SAVTCLQWPAEYnVFGLAEGKVRLS 

NTKTNKSSHYGTESYWSLTTNCSGKGILSGHA

DGYQR 


3761 


A 


2253 


320 


PVIQRCSQPYGFSLLISFFLKCVSETSQQPPSRKVF 
QLLPSFPTLTRSKSHESQLGNRIDDVSSMRFDLSH 
GSPQMVRRDIGLSVTHRFSTKSWLSQVCHVCQK 
SMFGVKCKHCRLKCHNKCTKEAPACRISFLPLT 
RLRRTESVPSDINNPVDRAAEPHFGTLPKALTKK 

EHPPAMNHLDSSSNPSSTTFSTPSSPAPFPTSSNPS 

SATTPP\OTSP\GQR\DSRFNFPSC/AWIHHR\Q\QFI 

FPDISAFAHAAPLPEAADGTRLDDQPKADVLEAH 

EAEAEEPEAGKSEAEDDEDEVDDLPSSRRPWRG 

PISRKASQTSVYLQEWDIPFEQVELGEPIGQGRW 

GRVHRGRWHGEVAIRLLEMDGHNQDHLKLFKK 

EVMNYRQTRHENVVLFMGACMNPPHLAIITSFC 

KGRTLHSFVRDPKTSLDINKTRQIAQEinCGMGYL 

HAKGIVHKDLKSRNVFYDNG\KWITDFGLF\GIS 

GWP\EGRRENQLKLSHDWLCYLAPEIVREMTPG 

NQAAEASIWQIGSGEGMKRVLTSVSLGKEVSEN 
LSACWAFDLQERPSXFSLLMDMLEKLPKLNRRLS 
HPGHF*KSADINSSKVVPRFERFGLGVLESSNPK 
M 


3762 


A 


2 


1578 


MAHYITFLCMVLVLLLQNSVLAEDGEVRSSCRT 

APTDLVFILDGSYSVGPENFEIVIGCWLVNITKNF 

DIGPKFIQVGVVQYSDYPVLEIPLGSYDSGEHLTA 

AVESILYLGGNTKTGKAIQFALDYLFAKSSRFLT 

KIAVVLTDGKSQDDVKDAAQAARDSKITLFAIG 

VGSETEDAELRAIANKPSSTYVFYVEDYIAISKIR 

EVMKQKLCEESVCPTRIPVAARDERGFDILLGLD 

VNKKVKKRIQLSPKKIKGYEVTSKVDLSELTSNV 

FPEGLPPSYVFVSTQRFKVKKIWDLWRILTIDG/* 

PQIAVTLNGVDKILLFTTTSVINGSQWTFANPQV 

KTLFDEGWHQIRLLVTEQDVTLYIDDQQIENKPL 

CDPEQNNRETACEIPGFCLNGPSDVGSTPAPCICP 
PGKPGLQGPKGDPGLPGNPGYPGQPGQDGKPVS 
TESLVISGISGITGYQGIAGTPGVPGSPGIQGARGL 
PGYKGEPGRDGDK 


3763 


A 


3 


1267 


CKVWRNPLNLFRGAEYNRYTWVTGREPLTYYD 

MNLSAQDHQTFFTCDSDHLRPADAIMQKAWRE 

RNPQARISAAHEALEINECATAYILLAEEEATTIA 

EAEKLFKQALKAGDGCYRRSQQLQHHGSQYEA 

QHSVLYLPLQXTRHQCLGVHQKKASNVCQKTOE 

DQGSSENDERFNEGVPPSEYVQYP*KPRKALLEL 

QAYADVQAVLAKYDDISLPKSATICYTAALLKA 

RAVSDKFSPEAASRRGLSTAEMNAVEAIHRAVEF 

NPHVPKVLI PMKSLILPPFHTLTfRGDSFAIAYAFF 

HLAHWKRVEGALNLLHCTWEGTFRMIPYPLEKG 

HLFYPYPICTETADRELLPSFHEVSVYPKKELPFFI 

LFTAGLCSFTAMLALLTHQFPELMGVFAKAVSV 

CLEGGLGEWMGKAKGIKAA 


3764 


A 


25 


1032 


RSADGLCGNKDRERGNEFTRNQQAAQEVVNPK 

KKMKKKKYVNSGTVTLLSFAVESECTFLDYIKG 

GTQINFTVABDFTASNGNPSQSTSLHYMSPYQLN 

AYALALTAVGEIIQHYDSDKMFPALGFGAKLPPD 

GRVSHEFPLNGNQENPSCCGIDGILEAYHRSLRT 

LIITDGVISDMAQTKEAIVNG\SKLPMSIIIVGVGQ 
AEFNAMVELDGDDVRISSRGKLAERDIVQFVPFR 
DYVDRTGNHVLSMARLARDVLAEIPDQLVSYM 
KAQGIRPRSPPAAPTHSPSQSPARTPPACPLHTHI 


3765 


A 


172 


3456 


LGMMDSPKIGNGLPVIGPGTDIGISSLHMVGYLG 

KNFDSAKVPSDEYCPACKEKGKLKALKTYRISFQ 

ESIFLCEDLQCIYPLGSKSLNNLISPDLEECHTPHK 

PQKRKSLESSYKDSLLLANSKKTRNYIAIDGGKV 

LNSKHNGEVYDETSSNLPDSSGQQNPIRTADSLE 

RMEn.EADTVDMATTKDPATVDVSGTGRPSPQN 

EGCTSKLEMPLESKCTSFPQALCVQWKNAYALC 

WLDCILSALVHSEELKNTVTGLCSKEESIFWRLL 

TKYNQANTLLYTSQLSGVKDGDCKKLTSEIFAEI 

ETCLNEVRDEIFISLQPQLRCTLGDMESPVFAFPL 

LLKLETHIEKLFLYSFSWDFECSQCGHQYQNRH 

MKSLVTFmWEWHPLNAAHFGPCNNCNSKSQI 

RKMVLEKVSPIFMLHFVEGLPQNDLQHYAFHFE 

GCLYQITSVIQYRANNHFITWILDADGSWLECDD 

LKGPCSERHKKFEVPASEIHIVIWERKISQVTDKE 

AACLPLKKTNDQHALSNEKPVSLTSCSVGDAAS 

AETASVTHPKDISVAPRTLSQDTAVTHGDHLLSG 

PKGLVDNILPLTLEETIQKTASVSQLNSEAFL\LEN 

KPVAENTGILKTNTLLSQESLMASSVSAPCNEKLI 

QDQFVDISFPSQVVNTNMQSVQLNTEDTVNTKS 

VNNTDATGLIQGVKSVEIEKDAQLKQFLTPKtEQ 

LKPERVTSQVSNLKKKETTADSQTTTSKSLQNQS 

LKENQKKPFVGSWVKGLISRGASFMPLCVSAHN 

RNTITDLQPSVKGVhfNFGGFKTKGINQKASHVSK 

KARKSASKPPPISKPPAGPPSSNGTAAHPHAHAA 

SEVLEKSGSTSCGAQLNHSSYGNGISSANHEDLV 

EGQIHKLRLKLRKKLKAEKKKLAALMSSPQSRT 

VRSENLEQVPQDGSPNDCESIEDLLNELPYPIDIA 

NESACTTVPGVSLYSSQTHEEILAELLSPTPVSTE 

LSENGEGDFRYLGMGDSHIPPPVPSEFNDVSQNT 

HLRQDHbTK^CSPTKKNPCEVQPDSLTNNACVRTL 

NLESPMKTDIFDEFFSSSALNALANDTLDLPHFDE 

YLFENY 


3766 


A 


3 


1622 


AQQIVYRNVMLENYKNLVSLGYQLTKPDVILRL 

EKGEEPWLVEREIHQETHPDSETAFEIKSSVSSRSI 

FKDKQSCDIKMEGMARNDLWYLSLEEVWKCRD

QLDKYQENPERHLRQVAFTQKKVLTQERVSESG 

KYGGNCLLPAQLVLREYFHKRDSHTKSLKHDLV 

LNGHQDSCASNSNECGQTFCQNIHLIQFARTHTG 

DKSYKCPDNDNSLTHGSSLGISKGIHREKPYECK 

ECGKFFSWRSNLTRHQLIHTGEKPYECKECGKSF 

SRSSHLIGHQKTHTGEEPYECKECGKSFSWFSHL 

VTHQRTHTGDKLYTCNQCGKSFA^HSSRLIRHQR 

THTGEKPYECPECGKSFRQSTHLILHQRTHVRVR 

PYECNECGKSYSQRSHLVVHHRIHTGLKPFECKD 

CGKCFSRSSHLYSHQRTHTGEKPYECHDCGKSFS 

QSSALIVHQRIHTGEKPYECCQCGKAFIRKNDLIK 

HQRIHVGEETYKCNQCGIIFSQNSPFIVHQIAHTG 

EQFLTCNQCGTALVNTSNLIGYQTNHIRENAY 


3767 


A 


3 


1622 


AQQP/YRNVMLENYKNLVSLGYQLTKPDVILRL 

EKGEEPWLVEREIHQETHPDSETAFEIKSSVSSRSI 

FKDKQSCDIKMEGMARNDLWYLSLEEVWKCRD

QLDKYQENPERHLRQVAFTQKKVLTQERVSESG 

KYGGNCLLPAQLVLREYFHKRDSHTKSLKHDLV 

LNGHQDSCASNSNECGQTFCQNIHLIQFARTHTG 

DKSYKCPDNDNSLTHGSSLGISKGIHREKPYECK 

ECGKFFSWRSNLTRHQLIHTGEKPYECKECGKSF 

SRSSHLIGHQKTHTGEEPYECKECGKSFSWFSHL 

VTHQRTHTGDKLYTCNQCGKSFA^HSSRLIRHQR 

THTGEKPYECPECGKSFRQSTHLILHQRTHVRVR 

PYECNECGKSYSQRSHLVVHHRIHTGLKPFECKD 

CGKCFSRSSHLYSHQRTHTGEKPYECHDCGKSFS 

QSSALIVHQRIHTGEKPYECCQCGKAFIRKNDLIK

HQRIHVGEETYKCNQCGIIFSQNSPFIVHQIAHTG 

EQFLTCNQCGTALVNTSNLIGYQTNHIRENAY 


3768 


A 


185 


2258 


SIIIKMSRKISBCESKKVNISSSLESEDISLETTVPTD 

DISSSEEREGKVRITRQLIERKELLHNIQLLKIELS 

QKTMMIDNLKVDYLTKIEELEEKLNDALHQKQL 

LTLRLDNQLAFQQKDASKYQELMKQEMETILLR 

QKQLEETNLQLREKAGDVRRSLRDFELTEEQYIK 

LKAFPEDQLSIPEYVSVRFYELVNPLRKEICELQV 

KKNILAEELSTNKNQLKQLTETYEEDRKNYSEV 

QIRCQRLALELADTKQLIQQGDYRQENYDKVKS 

ERDALEQEVIELRRKHEB.EASHMIQTKERSELSK 

EWTLEQTVTLLQKDBCEYLNRQNMELSVRCAHE 

EDRLERLQAQLEESKKAREEMYEKYVASRDHY 

KTEYENKLHDELEQIRLKTNQEIDQLRNASREMY 

ERENRNLREARDNAVAEKERAVMAEKDALEKH 

DQLLDRYRE\LQ\LSTESKVTEFLHQSKLKSFESE 

RVQLLQEETARNLTQCQLECEKYQKKLEVLTKE 

FYSLQASSEKRITELQAQNSEHQARLDIYEKLEK 

ELDEIIMQTAEIENEDEAERVLFSYGYGANVPTT 

AKRRLKQSVHLARRVLQLEKQNSLI/LKRSGTSK 

GPSNTAFTRSLTEANSLLNQTQQPYRYLIESVRQ 

RDSKIDSLTESIAQL/ERKDVSNLNKEKSALLQTN 

GIKMAL\DL\DQLLNHP 


3769 


A 


3 


2297 


DAAEFRVVADAMKVIGFKPEEIQTVYKILAAILH 

LGNLKFVVDGDTPLIENGKVVSnAELLSTKTDM 

VEKALLYRTVATGRDIIDKQHTEQEASYGRDAF 

AKAIYERLFCWIVTRINDIIEVKNYDTTIHGKNTV 

IGVLDIYGFEIFDNNSFEQFCINYCNEKLQQLFIQL 

VLKQEQEEYQREGIPWKHIDYFNNQIIVDLVEQQ 

HKGIIAILDDACMNVGKVTDEMFLEALNSKLGK 

HAHFSSRKLCASDKILEFDRDFRJRHYAGDVVYS 

VIGFIDKNKDTLFQDFKRLMYNSSNPVLKNMWP 

EGKLSITEVTKRPLTAATLFKNSMIALVDNLASK 

EPYYVRCDCPNDKKSPQIFDDERCRHQVEYLGLL 

ENVRVRRAGFAFRQTYEKFLHRYKMISEFTWPN 

HDLPSDKEAVKKLIERCGFQDDVAYGKTKIFIRT 

PRTLFTLEELRAQMLIRIVLFLQKVWRGTLARMR 

YKRTKAALTIIRYYRRYKVKSYIHEVARRFHGVK 

TIVIRDYGKHVKWPSPPKVLRRFEEALQTIFNRWR 

ASQLIKSIPASDLPQVRAKVAAVEMLKGQRADL 

GLQRAWEGNYLASKPDTPQTSGTFVPVANELKR 

KDKYMNVLFSCHVRKVNRFSKVEDRAIFVTDRH 

LYKMDPTKQYKVMKTIPLYNLTGLSVSNGKDQL 

VVFHTKDNKDLIVCLFSKQPTHESRIGELWGVLV 

NHFKSEKRHLQVXNVTNPVQCSLHGKKCTVSVE 

TRLNQPQPDFTKNRSGFILSVPGN 


3770 


A 


3 


6276 


HKVAAPDVVVPTLDTVRHEALLYTWLAEHKPL 
VLCGPPGSGKTMTLFSALRALPDMEWGLNFSS 

ATTPELLLKTFDHYCEYRRTPNGVVLAPVQLGK 

WLVLFCDEINLPDMDKYGTQRVISFIRQMVEHG 

GFYRTSDQTWVKLERIQFVGACNPPTDPGRKPLS 

HRFLRHVPVVYVDYPGPASLTQIYGTFNRAMLR 

LIPSLRTYAEPLTAAMVEFYTMSQERFTQDTQPH 

YIYSPREMTRWVRGIFEALRPLETLPVEGLIRIWA 

HEALRLFQDRLVEDEERRWTDENIDTVALKHFP 

NIDREKAMSRPILYSNWLSKDYIPVDQEELRDYV 

KARLKVFYEEELDVPLVLFNEVLDHVLRIDRIFR 

QPQGHLLLIGVSGAGKTTLSRFVAWMNGLSVYQ 

IKVHRKYTGEDFDEDLRTVLRRSGCKNEKIAFIM 

DESNVLDSGFLERMNTLLANGEVPGLFEGDEYA 

TLMTQCKEGAQKEGLMLDSHEELYKWFTSQVIR 

NLHVVFTMNPSSEGLKDRAATSPALFNRCVLNW 

FGDWSTEALYQVGKEFTSKMDLEKPNYIVPDYM 

PWYDKLPQPPSHREAIVNSCVFVHQTLHQANA 

RLAKRGGRTMAITPRHYLDFINHYANLFHEKRSE 

LEEQQMHLNVGLRKIKETVDQVEELRRDLRIKS 

QELEVKNAAAhTOKLKKMVKDQQEAEKKKVMS 

QEIQEQLHKQQEVIADKQMSVKEDLDKVEPAVI 

EAQNAVKSIKKQHLVEVRSMANPPAAVKLALES 

ICLLLGESTTDWKQIRSIIMRENFIPTIVNFSAEEIS 

DAIREKMKJCNYMSNPSY^rYEI^ 

KWAIAQLNYADMLKRVEPLRNELQKLEDDAKD 

NQQKANEVEQMIRDLEASIARYKEEYAVLISEAQ 

AIKADLAAVEAKVNRSTALLKSLSAERERWEKT 

SETFKNQMSTIAGDCLLSAAFIAYAGYFDQQMR 

QNLFTTWSHHLQQANIQFRTDIARTEYLSNADER 

LRWQASSLPADDLCTENAIMLKRFNRYPLIIDPS 

GQATEFIMNEYKDRKITRTSFLDDAFRKNLESAL 

RFGNPLLVQDVESYDPVLNPVLNREVRRTGGRV 

LITLGDQDIDLSPSFVIFLSTRDPTVEFPPDLCSRV 

TFVNFTVTRSSLQSQCLNEVLKAERPDVDEKRSD 

LLKLQGEFQLRLRQLEKSLLQALNEVKGRILDDD 

TIITTLENLKREAAEVTRKVEETDIVMQEVETVS 

QQYLPLSTACSSIYFTMESLKQIHFLYQYSLQFFL 

DIYHNVLYENPNLKGVTDHTQRLSHTKDLFQVA 

FNRVARGMLHQDHITFAMLLARIKLKGTVGEPT 

YDAEFQHFLRGNEIVLSAGSTPRIQGLTVEQAEA 

VVRLSCLPAFKDLIAKVQADEQFGIWLDSSSPEQ 

TVPYLWSEETPATPIGQAIHRLLLIQAFRPDRLLA 

MAHMFVSTNLGESFMSIMEQPLDLTQIVGTEVKP 

NTPVLMCSVPGYDASGHVEDLAAEQNTQITSIAI 

GSAEGFNQADKAINTAVKSGRWVMLKNVHLAP 

GWLMQLEKKLHSLQPHACFRLFLTMEINPKVPV 

NLLRAGRIFVFEPPPGVKANMLRTFSSIPVSRICK 

SPNERARLYFLLAWFHAIIQERLRYAPLGWSKKY 

EFGESDLRSACDTVDTWLDDTAKGRQNISPDKIP 

WSALKTLMAQSIYGGRVDNEFDQRLLNTFLERL 

FITRSFDSEFKLACKVDGHKDIQMPDGIRREEFV 

QWVELLPDTQTPSWLGLPNNAERVLLTTQGVD 

MISKMLKMQMLEDEDDLAYAETEKKTRTDSTS 

DGRPU.WMRTLHTTASNWLHLIPQTLSHLKRTVE 

NIKDPLFRFFE\REVKMGAKLLQ\DVRQDLADV\V 

QVCEGKKKQTNYLRTLI\NELV\KGILP\RSWSHY 

TVPAG\MTVIQWGVPISARRI\KQLQNISL\AAASG 

GAKELKNIHVCLGGLFVPEAYITATRQYVAQAN 

SWSLEELCLEVNVTTSQGATLDACSFGVTGLKL 

QGATCNNNKLSLSNAISTALPLTQLRWVKQTNT 

EKKASVVTLPVYLNFTRADLIFTVDFEIATKEDPR 

SFYERGVAVLCTE 


3771 


A 


1 


2043 


LPLLHAGFNRRFMENSSnACYNELIQlEHGEVRS 

QFiaRACNSVFTALDHCHEAIEITSDDHVIQYVN 

PAFERMMGYHKGELLGKELADLPKSDKNRADL 

LDTINTCIKKGKEWQGVYYARRKSGDSIQQHVKI 

TPVIGQGGKIRHFVSLKKLCCTTONNKQIHKIHR 

DSGDNSQTEPHSFRYKNRRKESIDVKSISSRGSDA 

PSLQNRRYPSMARfflSMTIEAPITKVINIINAAQEN 

SPVTVAEALDRVLEILRTTELYSPQLGTKDEDPH 

TSDLVGGLMTDGLRRLSGNEYVFTKNVHQSHSH 

LAMPITINDVPPCISQLLDNEESWDFNIFELEAITH 

KRPLVYLGLKVFSRFGVCEFLNCSETTLRAWFQ 

VIEANYHSSNAYHNSTHAADVLHATAFFLGKER 

VKGSLDQLDEVAALIAATVHDVDHPGRTNSFL\C 

NAGSELAVLYND-nAVVLESHHTALAFQXLTVKDT 

K\CNIFKNID/RGNHYRTLRQAIIDMVLATEMTKH 

FEHVNKFVNSINKPMAAEIEGSDCECNPAGKNFP 

ENQILIKRMMIKCADVANPCRPLDLCIEWAGRIS 

EEYFAQTDEEKRQGLPVVMPVFDRNTCSIPKSQI 

SFIDYFITDMFDAWDAFAHLPALMQHLADNYKH 

WKTLDDLKCKSLRLPSDRLKPSHRGGLLTDKGH 

CESQ 


3772


A 


1013 


50 


TLVHADGFPSLHITETCLAYREKRIGIDLVHDTVE 

HELIKEAEIIQGIMALLTRTLEEASEQIRMNRSAK 

YNLEKDLKDKFVALTIDDICFSLNNNSPNIRYSEN 

AVRIEPNSVSLEDWLDFSSTNVEKADKQRNNSL 

MLKALVD\RILSQTANYLRKQCDWHTAFKNGL 

KDTKDARDQLADHLAKWMBEIASQEKNITALEK 

AILDQEGPAKVAHTRLETRTHRPNVELCRDVAQ 

YRLMKEVQEITHNVARLKETLA\QAQAELKGLH 

RRQLALQEEIQVKENTIYIDEVLCMQMRKSIPLR 

DGEDHGVWAGGLRPDAVC 


3773 


A 


1 


955 


AAARESERQLRLRLCVLNEILGTERDYVGTLRFL 

QSAFLHRIRQNVADSVEKGLTEENVKVLFSNIEDI 

LEVHKDFLAALEYCLHPEPQSQHELGNVFLKFK 

DKFCVYEEYCSNHEKALRLLVELNKIPTVRAFLL 

SCMLLGGRKTTDIPLEGYLVLSPIQRICKYPLLLKE 

LAKRTPGKHPDHPAVQ\SALQAMKTVCSNINETK 

RQMEKLEALEAAA/QSHIEGWEGSNLTDICTQLL 

LQGTLLKISAGNIQERAFFLFDNLLVYCKRKSRV 

TGSKKSTKRTKSINGSLYIFRGRINTEVMEVENVE 

DGTGSPSPSLA 


3774 


A 


4254 


2061 


ELQGDFSVPDVPKSMAWCENSICVGFKRDYYLI 

RVDGKGSIKELFPTGKQLEPLVAPLADGKVAVG 

QDDLTVVLNEEGICTQKCALNWTDIPVAMEHQP 

PYIIAVLPRYVEIRTFEPRLLVQSIELQRPRFITSGG 

SNIIYVASNHFVWRLIPVPMATQIQQLLQDKQFE 

LALQLAEMKDDSDSEKQQQIHHIKNLYAFNLFC 

QKRFDESMQVFAKLGTDPTHVMGLYPDLLPTDY 

RKQLQYPNPLPVLSGAELEKAHLALIDYLTQKRS 

QLVPCKll^SDHQSSTSPLMEGTPTIKSKKKLLQn 

DTTLLKCYLHTNVALVAPLLRLENNHCHffiESEH 

VLKKAHKYSELIILYEKKGLHEKALQVLVDQSK 

KANSPLKGHERTVQYLQHLGTENLHLIFSYSVW 

VLRDFPEDGLKIFTEDLPEVESLPRDRVLGFLIEN 

FKGLAIPYLEHIIHVWEETGSRFHNCLIQLYCEKV 

QGLMKEYLLSFPAGKTPVPAGEEEGELGEYRQK 

LLMFLEISSYYDPGRLICDFPFDGLLEERALLLGR 

MGKHEQALFIYVHILKDTRMAEEYCHKHYDRN 

KDGNKDVYLSLLRMYLSPPSIHCLGPIKLELLEPK 

ANLQAALQVLELHHSKLDTTKALNLLPANTQIN 

DIRIFLEKVLEENAQKKRFNQVLKNLLHAEFLRV\ 

QEERILHQQVKCIITEEKVCMVCKKKIGNSAFAR 

.YPNGVVVHYFCS\KEVNPADT 


3775 


A 


1832 


839 


MSRARGALCRACLALAAALAALLLLPLPLPRAP 
APARTPAPAPRAPPSRPAAPSLRPDDVFIAVKTTR 

KNHGPRLRLLLRTWMSRARQQTFIFTDGDDPELE 

LQGGDRVINTNCSAVRTRQALCCKMSVEYDKFI 

ESGRKWFCHVDDDNYVNARSLLHLLSSFSPSQD 

VYLGRPSLDHPIEATERVQGGRTVTTVKFWFAT 

GGAGFCLSRGLALKMSPWASLGSFMSTAEQVRL 

PDDCTVGYIVEGLLGARLLHSPLFHSHLENLQRL 

PPDTLLQQVTLSHGGPENPQNVVNVAGGFSLHQ 

DPTRFKSIHCLLYPDTDWCPRQKQGAPTSR 


3776 


A 


3 


796 


PRAKLGTRARNMAGQDAGCGRGGDDYSEDEGD 

SSVSRAAVEVFGKLKDLNCPFLEGLYITEPKTIQE 

LLCSPSEYRLEILEWMCTRVWPSLQDRFSSLKGV 

PTEVKIQEMTKLGHELMLCAPDDQELLKGCACA 

QKQLHFMDQLLDTIRSLTIGCSSCSSLMEHFEDT 

REKNEALLGELFSSPHLQMLLNPECDPWPLDMQ 

PLLNKQSDDWQWASASAKSEEEEKLAELARQLQ 

ESAAKLHALRTEYFAQHEQGAAAGAAMSAP 


3777


A 


3 


413 


SEEDVIEGKTAVIEKJRIIKKRSSAGVVED/IGGEVQ 

NMLEGVGVDINKALLAKRKRLEMYTKASLRTSN 

QKIEHVWKTQQDQRQKLNQEYSQQFLTLFQQW 

DLDMQKAEEQEEKILVGIMIRFIINQVSSRNGQPS 

LLL 


3778 


A 


132 


788 


SRLPPPPPHLADGRAGARVPRSARLSRWWVQD 

WTHGPIVRPPAAARTMWVNPEEVLLANALWITE 

RANPYFILQRRKGHAGDGGGGGGLAGLLVGTLD 

VVLDSSARVAPYRILYQTPDSLVYWTIACG\GSR 

KEITEHWEWLEQNLLQTLSIFE^^E^^DITTFVRGKI 

QGIIAEYNKINDVKEDDDTEKFKEAIVKFHRLFG 

MPEEEKLVNYYSCSYWKG 


3779 


A 


2 


934 


CKSCTLFPQNPNLPPPSTRERPPGCKTVFVGGLPE 

NATEEIIQEVFEQCGDITAIRKSKKNFCHIRFAEEF 

MVDKAIYLSGYRMRLGSSTDKKDSGRLHVDFA 

QARDDFYEWECKQRMRAREERHRRKLEEDRLR 

PPSPPAIMHYSEHEAALLAEKLKDDSKFSEAM\Q 

VLLSWIERGEVNRRNSANQFYSMVQSANSHVRRL 

MNEKATHEQEMEEAKENFKNALTGILTQFEQIV 

AVFNASTRQKAWDHFSKAQRKNIDIWAKNHSEE 

LRNAQSEQLMGIRREEEMEMSDDENCDSPTKKM 

RVDESALGAP 


3780 


A 


1 


2535 


AAQAEREELAAGRMPGGGPQGAPAAAGGGGVS 

HRAGSRDCLPPAACFRRRRLARRPGYMRSSTGP 

GIGFLSPAVGTLFRFPGGVSGEESHHSESRARQC 

GLDSRGLLVRSPVSKSAAAPTVTSVRGTSAHFGI 

QLRGGTRLPDRLSWPCGPGSAGWQQEFAAMDS 

SETLDASWEAACSDGARRVRAAGSLPSAELSSNS 

CSPGCGPEVPPTPPGSHSAFTSSFSFIRLSLGSAGE 

RGEAEGCPPSREAESHCQSPQEMGAKAASLDGP 

HEDPRCLSQPFSLLATRVSADLAQAARNSSRPER 

DMHSLPDMDPGSSSSLDPSLAGCGGDGSSGSGD 

AHSWDTLLRKWEPVLRDCLLRMRRQMEVISLRL 

KLQKLQEDAVENDDYDKAETLQQRLEDLEQEKI 

SLHFQLPSRQPALSSFLGHLAAQVQAALRRGATQ 

QASGDDTHTPLRMEPRLLEPTAQDSLHVSITRRD 

WLLQEKQQLQKEIEALQARMFVLEAKDQQLRRE 

lEEQEQQLQWQGCDLTPLVGQLSLGQLQEVSKA 

LQDTLASAGQIPFHAEPPETIRSLQERIKSLl^SLK 

EITTKVCMSEKFCSTLRKKVNDIETQLPALLEAK 

MHAISGNHFWTAKDLTEEIRSLTSDREGLEGLLS 

KLLVLSSR>IVKKLGSVKEDYNRLRREVEHQETA 

YETSVKENTMKYMETLKNKLCSCKCPLLGKVW 

EADLEACRLLIQCLQLQEARGSLSVEDERQMDD 

LEGAAPPIPPRLHSEDKRKTPLKESYILSAELGEK 

CEDIGKKLLYLEDQLHTAIHSHDEDLIQSLRRELQ 

MVKETLQAMILQLQPAKEAGEREAAASCMTAG 

VHEAQA 


3781 


A 


3 


995 


GRRRAGPAHSARMYNMMETELKPPGPQQTSGG 

GGGNSTAAAAGGNQKNSPDRVKRPMNAFMVW 

SRGQRRKMAQENPKMHNSEISKRLGAEWKLLSE 

TEKRPFIDEAKRLRALHMKEHPDYKYRPRRKTK 

TLMKKDKYTLPGGLLAPGGNSMASGVGVGAGL 

GAGVNQRMDSYAHMNGWSNGSYSMMQDQLG 

YPQHPGLNAHGAAQMQPMHRYDVSALQYNSM 

TSSQTYMNG/SRPTYSMSYSQQGTPGMAPGS\MG 

SVVKSEASSSPPVVTSSSHSRAPCQAGDLRDMIS 

MYLPGAEVPEPAAPSRLHMSQHYQSGPVPGTAI 

NGTLPLSHM 


3782 


A 


1 


2649 


FRVPDSCPWLHSFTQLDPDLPRPESSTQEIGEELI 

NGVIYSISLRKVQLHHGGNKGQRWLGYENESAL 

NLYETCKVRTVKAGTLEKLVEHLVPAFQGSDLS 

YVTIFLCTYRAFTTTQQVLDLLFKRYGRCDALTA 

SSRYGCILPYSDEDGGPQDQLKNAISSILGTWLD 

QYSEDFCQPPDFPCLKQLVAYVQLNMPGSDLER 

RAHLLLAQLEHSEPIEAEPEGEEDWALSPVPALK 

PTPELELALTPARAPSPVPAPAPEPEPAPTPAPGSE 

LEVAPAPAPELQQAPEPAVGLESAPAPALELEPA 

PEQDPAPSQTLELEPAPAPVPSLQPSWPSPVVAEN 

GLSEEKPHLLVFPPDLVAEQFTLMDAELFKKWP 

YHCLGSIWSQRDKKGKEHLAPTIRATVTQFNSV 

ANCVITTCLGNRSTKAPDRARWEHWffiVAREC 

RILKNFSSLYAILSALQSNSIHRLKKTWEDVSRDS 

FRIFQKLSEIFSDENNYSLSRELLIKEGTSKFATLE 

MNPKRAQKRPKETGHQGTVPYLGTFLTDLVML 

DTAMKDYLYGRLINFEKRRKEFEVIAQIKLLQSA 

CNNYSIAPDEQFGAWFRAVERLSETESYNLSCEL 

EPPSESASNTLRTKKNTAIVKRWSDRQAPSTELS 

TSGSSHSKSCDQLRCGPYLSSGDIADALSVHSAG 

SSSSDVEEINISFVPESPDGQEKKFWESASQSSPET 

SGISSASSSTSSSSASTTPVAATRTHKRSVSGLCNS 

SSALPLYNQQVGDCCIIRVSLDVDNGNMYKSILV 

TSQDKAPAVIRKAMDKHNLEEEEPEDYELLQILS 

DDRKLKIPENANVFYAMNSTANYDFVLKKRTFT 

KGVKVKHGASSTLPRMKQKGLKIAKGIF 


3783 


A 


3 


869 


RSGQGKVYGLIGRRRFQQMDVLEGLNLLITISGK 

RNKLRVYYLSWLRNKILHNDPEVEKKQGWTTV 

GDMEGCGHYRWKYERDCFLVIALKSSVEVYAW 

APKPYHKFMAFKSFADLPHRPLLVDLTVEEGQR 

LKVIYGSSAGFHAVDVDSGNSYDIYIPVHIQSQIT 

PHAIIFLPNTDGMEMLLCYEDEGVYVNTYGRIIK 

DVVLQWGEMPTSVAYICSNQIMGWGEKAIEIRS 

VETGHLDGVFMHKRAQRLKFLCERNDKVFFASV 

RSGGSSQVYFMTLNRNCIMNW 


3784 


A 


1213 


457 


LSPRQVDGLAGLQKGLSLSLLYQFLMNGIRLGTY 

GLAEAGGYLHTAEGTHSPARSAAAGAMAGVMG 

AYLGSPIYMVKTHLQAQAASEIAVGHQYKHQG 

MFQALTEIGQKHGLVGLWRGALGGLPRVIVGSS 

TQLCTFSSTKDLLSQWEFPPQSWKLALVAAMM 

SGIAVVLAMAPFDVACTRLYNQPHRCTGQGP\LY 

RGE.DALLQTARTEGIFGMYKGIGASYFRLGPHT1 

LSLFFWDQLRSLYYTDTK 


3785 


A 


193 


813 


RRRGRHSLCGGKMLAYCVQDATVVDVEKRKNP 

SKHYVYIINVTWSDSTSQT1YRRY\SKFFDLQMQL 

LD\KFPI\ESGQKDPKQRIIPFLPGK1LFRRSHIRDV 

AVKRLKPIDEYCRALVRLPPHISQCDEVFRFFEAR 

PEDVNPPKEQGPSPPDAVLPYGVNKGKQELKAG 

PNWPGRTHHVVNCVTQKCLFVFHFKFSSSGNKE 

SKSL 


3786 


A 


3785 


1632 


EFVGRAASTTVVTRIAWRMADAGIRRVVPSDLY 

PLVLGFLRDNQLSEVANKFAKATGATQQDANAS 

SLLDIYSFWLNRSAKVPERKLQANGPVAKKAKK 

KASSSDSEDSSEEEEEVQGPPAKKAAVPAKRVGL 

PPGKAAAKASESSSSEESSDDDDEEDQKKQPVQ 

KGVKPQAKAGQAPPKKAKSSDSDSDSSSEDEPP 

KNQKPKITP\VTVKAQTKAPPKPARA\APKIANGK 

AASSSSSSSSSSSSDDSEEEKAAATPKKTVPKKQV 

VAKAPVKAATTPTRKSSSSEDSSSDEEEEQKKPM 

KNKPGPYSSVPPPSAPPPKKSLGTQPPKKAVEKQ 

QPVESSEDSSDESDSSSEEEKKPPTKAVVSKATTK 

PPPAKKAAESSSDSSDSDSSEDDEAPSKPAGTTK 

NSSNKPAVTTKSPAVKPAAAPKQPVGGGQKLLT 

RKADSSSSEEESSSSEEEKTKKMVATTKPKATAK 

AALSLPAKQAPQGSRDSSSDSDSSSSEEEEEKTSK 

SAVKKKPQKVAGGAAPSKPASAKKGKAESSNSS 

SSDDSSEEEEEKLKGKGSPRPQAPKANGTSALTA 

QNGKAAKNSEEEEEEKKKAAVVVSKSGSLKKR 

KQNEAAKEAETPQAKKIKLQTPNTFPKRKKGEK 

RASSPFRRVREEEIEVDSRVADNSFDAKRGAAGD 

WGERANQVLKFTKGKSFRJHEKTKKKRGSYRGG 

SISVQVNSIKFDSE 


3787 


A 


3 


5078 


IPEG/RALSAEHTSSLVPSLHITTLGQEQAILSGAV 
PASPSTGTADFPSILTFLQPTENHASPSPVPEMPTL 

PAEGSDGSPPATRDLLLSSKVPNLLSTSWTFPRW 

KKDSVTAELGKNEEANVTIPLQAFPRKEVLSLHT 

VNGFVSDFSTGSVSSPnTAPRTNPLPSGPPLPSILS 

IQATQTVFPSLLAFSSTKPEVYAAAVDHSGLPAS 

APKQVRASPSSMDVYDSLTIGDMKKPATTDVFW 

SSLSAETGSLSTESIISGLQQQTNYDLNGHTISTTS 

AVETHLAPTAPPNGLTSAADAIKSQDFKDTAGHS 

VTAEGFSIQDLVLGTSmQPVQQSDMTMVGSHID 

LWPTSNNNHSRDFQTAEVAYYSPTTRHSVSHPQ 

LQLPNQPAHPLLLTSPGPTSTGSLQEMLSDGTDT 

GSEISSDINSSPERNASTPFQNILGYHSAAESSISTS 

VFPRTSSRVLRASQHPKKWTADTVSSKVQPTAA 

AAVTLFLRKSSPPALSAALVAKGTSSSPLAVASG 

PAKSSSMTTLAKNVTNKAASGPKRTPGAVHTAF 

PFTPTYMYARTGHTTSTHTA/IARKHGHCLWPVV 

YNLP/PP/GKPQAMHTGLPNPTNLEMPRASTPRPL 

TVTAALTSITASVKATRLPPLRAENTDAVLPAAS 

AAVVTTGKMASNLECQMSSKLLVKTVLFLTQRR 

VQISESLKFSIAKGLTQALRKAFHQNDVSAHVDI 

LEYSHNVTVGYYATKGKLVYLPAWIEMLGVY 

GVSNVTADLKQHTPHLQSVAVLASPWNPQPAG 

YFQLKTVLQFVSQADNIQSCKFAQTMEQRLQKA 

FQDAERKVLNTKSNLTIQIVSTSNASQAVTLVYV 

VGNQSTFLNGTVASSLLSQLSAELVGFYLTYPPL 

TIAEPLEYPNLDISETTRDYWVITVLQGVDNSLV 

GLHNQSFARVMEQRLAQLFMMSQQQGRRFKRA 

TTLGSYTVQMVKMQRVPGPKDPAELTYYTLYN 

GKPLLGTAAAKILSTIDSQRMALTLHHVVLLQAD 

PWKNPPNNLWIIAAVLAPIAVVTVIIIIITAVLCR 

KNKNDFKPDTMINLPQRAKPVQGFDYAKQHLG 

QQGADEEVIPVTQETVVLPLPIRDAPQERDVAQD 

GSTIKTAKSTETRKSRSPSENGSVISNESGKPSSGR 

RSPQNVMAQQKVTKEEARKRNVPASDEEEGAV 

LFDNSSKVAAEPFDTSSGSVQLIAIKPTALPMVPP 

TSDRSQESSAVLNGEVNKALKQKSDIEHYRNKL 

RLKAKRKGYYDFPAVETSKGLTERKKMYEKAP 

KEMEHVLDPDSELCAPFTESKNRQQMKNSVYRS 

RQSLNSPSPGETEMDLLVTRERPRRGIRKSGYDT 

EPEIIEETNIDRVPEPRGYSRSRQVKGHSETSTLSS 

QPSIDEVRQQMHMLLEEAFSLASAGHAGQSRHQ 

EAYGSAQHLPYSEVVTSAPGTMTRPRAGVQWVP 

TYRPEMYQYSLPRPAYRFSQLPEMVMGSPPPPVP 

PRTGPVAVASLRRSTSDIGSKTRMAESTGPEPAQ 

LHDSASFTQMSRGPVSVTQLDQSALNYSGNTVP 

AVFAIPAANRPGFTGYFIPTPPSSYRNQAWMSYA 

GENELPSQWADSVPLPGYffiAYPRSRYPQSSPSRL 

PRQYSQPANLHPSLEQAPAPSTAASQQSLAENDP 

SDAPLTWSTAALVKAIREEVAKLAKKQTDMFEF 

QV 


3788 


A 


2 


1737 


MKGLYTDAEMKSDNVKDKDAKISFLQKAIDVV 

VMVSGEPLLAKPARIVAGHEPERTNELLQIIGKC 

CLNKLSSDDAVRRVLAGEKGEVKGRASLTSRSQ 

ELDNKIsTVREEESRVHKNTEDRGDAEIKERSTSRD 

RKQKEELKEDRMPREKDKDKEKAKENGGNRHR 

EGERERAKARARPDNERQKDRGNRERDRDSERK 



WO 01/57190

PCT/US01/04098




KETERKSEGGKEKERLRDRDRERDRDKGKDRDR 

RRVKNGEHSWDLDRENNREHDKPEKKSASSGE 

MSKKLSDGTFKDSKAETETEISTRASKSLTTKTS 

KRRSKNSVEGDSTSDAEGDAGPAGQDKSEVPET 

PEIPNELSSNIRRIPRPGSARPAPPRVKRQDSMEAL 

QMDRSGSGKTVSNVITESHNSDNEEDDQFWEA 

APQLSEMSEIEMVTAVELEEEEKHGGLVKKILET 

KKDYEKLQQSPKPGEKERSLFESAWKKEKDIVS 

KEIEKLRTSIQTLCKSALPLGKIMDYIQEDVDAM 

QNELQM\YHSENRQHAEALQQEQRITDCAVEP\L 

KAELA\ELEQLIKD\Q\QDKICAVKANILKNEEKIQ 

KMVYSINLTSRR 


3789 


A 


1 


4369 


MRTLGTCLATLAGLLLTAAGETFSGGCLFDEPYS 

TCGYSQSEGDDFNWEQWTLTKPTSDPWMPSGS 

FMLVNASGRPEGQRAHLLLPQLKENDTHCIDFH 

YFVSSKSNSPPGLLNVYVKVNNGPLGNPIWNISG 

DPTRTWNRAELAISTFWPNFYQVIFEVITSGHQG 

YLAIDEVKVLGHPCTRTPHFLRIQNVEVNAGQFA 

TFQCSAIGRTVAGDRLWLQGIDVRDAPLKEIKVT 

SSRRFIASFNVWTTKRDAGKYRCMIXRTEGGVGI 

SNYAELWVKEPPVPIAPPQLASVGATYLWIQLN 

ANSINGDGPIVAREVEYCTASGSWNDRQPVDSTS 

YKIGHLDPDTEYEISVLLTRPGEGGTGSPGPALRT 

RTKCADPMRGPRKLEVVEVKSRQITIRWEPFGY 

NVTRCHSYNLTVHYCYQVGGQEQVREEVSWDT 

ENSHPQHTITNLSPYT>rVSVKLILMNPEGRKESQ 

ELIVQTDEDLPGAVPTESIQGSTFEEKIFLQWREP 

TQTYGVITLYEITYKAVSSFDPEIDLSNQSGRVSK 

LGNETHFLFFGLYPGTTYSFTIRASTAKGFGPPAT 

NQFTTKISAPSMPAYELETPLNQTDNTVTVMLKP 

AHSRGAPVSVYQIWEEERPRRTKKTTEILKCYP 

VPIHFQNASLLNSQYYFAAEFPADSLQAAQPFTIG 

DNKTYNGYWNTPLLPYKSYRIYFQAASRANGET 

KIDCVQVATKGAATPKPVPEPEKQTDHTVKIAG 

VIAGILLFVIIFLGVVLVMKKRKLVAKKRKETMSS 

TRQEIDLWIGELNGPRSYAEQGTBCLATRAFSFMD 

THNLNGRSVSSPSSFTMKTNTLSTSVPNSYYPDE 

THTMASDTSSLVQSHTYKKREPADVPYQTGQLH 

PAIRVADLLQHITQMKCAEGYGFKEEYESFFEGQ 

SAPWDSAKKDENRMKNRYGNIIAYDHSRVRLQT 

lEGDTNSDYINGNYTOGYHRPNHYIATQGPMQET 

r^DFWRMVWHENTASIIMVTNLVEVGRVKCCK 

YWPDDTEIYKDIKVTLIETELLAEYVIRTFAVEKR 

GVHEIREIRQFHFTGWPDHGVPYHATGLLGFVR 

QVKSKSPPSAGPLWHCSAGAGRTGCFIVIDIML 

DMAEREGVVDIYNCVRELRSRRVNMVQTEEQY 

VFIHDAILEACLCGDTSVPASQVRSLYYDMNKLD 

PQTNSSQIKEEFRTLNMVTPTLRVEDCSIALLPRN 

HEKNRCMDILPPDRCLPFLITIDGESSNYINAALM 

DSYKQPSAFIVTQHPLPNTVKDFWRLVLDYHCTS 

VVMLNDVDPAQLCPQYWPENGVHRHGPIQVEF 

VSADLEEDIISRIFRIYNAARPQDGYRMVQQFQFL 

GWPMYRDTPVSKRSFLKLIRQVDKWQEEYNGG 

EGRTWHCLNGGGRSGTFCAISIVCEMLRHQRTV 

DWHAVKTLRNNKPNMVDLLDQYKFCYEVALE 

YLNSG 


3790 


A 


261 


485 


EEQTPLHIASRLGKTEIVQLLLQHMAHPDAATTN 
GYTPLHISAREGQV\DV\ASVLLGRQGAAHSFRLT 
KVRRMTS 


3791 


A 


1 


5874 


LPPVTMSGKYIMEEHDSYSDQVWSIDELPSKQG 

YYLQGNYLRCVAEVGSFEHNLTTDLLNHLVFVQ 

KVFMKEVNEVIQKVSGGEQPIPLWNEHDGTADG 

DKPKDLLYSLNLQFKGIQVTATTPSMRAVRFETG 

LIELELSNRLQTKASPGSSSYLKLFGKCQVDLNL 

ALGQIVKHQVYEEAGSDFHQVAYFKTRIGLRNA 

LREEISGSSDREAVLITLNRPIVYAQPVAFDRAVL 

FWLNYK\AAYDNWNEQRMALHKDIHMATKEVV 

DMLPGIQQTSAQAFGTPFLQLTVNDLGICLPITNT 

AQSNHTGDLDTGSALVLTIESTLITACSSESLVSK 

GHFKNFCIRFADGFETSWDDWKPEIHGDLVMNA 

CWPDGTYEVCSRTTGQAAAESSSAGTWTLNVL 

WKMCGIDVHMDPNIGKRLNALGNTLTTLTGEED 

IDDIADLNSVNIADLSDEDEVDTMSPTIHTEATDY 

RRQAASASQPGELRGRKIMKRIVDIRELNEQAKV 

IDDLKKLGASEGTINQEIQRYQQLESVAVNDIRR 

DVRKKLRRSSMRAASLKDKWGLSYKPSYSRSKS 

ISASGRPPLKRMERASSRVGETEELPEIRVDAASP 

GPRVTFNIQDTFPEETELDLLSVTIEGPSHYSSNSE 

GSCSVFSSPKTPGGFSPGEPFQTEEGRRDDSLSSTS 

EDSEKDEKDEDHERERFYIYRKPSHTSRKKATGF 

AAVHQLFTERWPTTPVNRSLSGTATERNIDFELD 

IRVEIDSGKCVLHPTTLLQEHDDISLRRSYDRSSR 

SLDQDSPSKKKKFQTNYASTTHLMTGKKVPSSL 

QTKPSDLETTVFYIPGVDVKLHYNSKTLKTESPN 

ASRGSSLPRTLSKESKLYGMKDSATSPPSPPLPST 

VQSKTNTLLPPQPPPIPAAKGKGSGGVKTAKLYA 

WVALQSLPEEMVISPCLLDFLEKALETIPITPVER 

NYTAVSSQDEDMGHFEIPDPMEESVTTSLVSXSSTS 

AYSSFPVDVVVYVRVQPSQIKFSCLPVSRVECML 

KLPSLDL VFSSNRGELETLGTTYPAETLSPGGNA 

TQSGTKTSASKTGIPGSSGLGSPLGRSRHSSSQSD 

LTSSSSSSSGLSFTACMSDFSLYVFHPYGAGKQIT 

AVSGLTPGSGGLGNVDEEPTSVTGRKDSLSINLE 

FVKVSLSRIRRSGGASFFESQSVSKSASKMDTTLl 

NISAVCDIGSASFKYDMRRLSBILAFPRAWYRRSI 

ARRLFLGDQTINLPTSGPGTPDSIEGVSQHLSPESS 

RKAYCKTWEQPSQSASFTHMPQSPNVFNEHMTN 

STMSPGTVGQSLKSPASIRSRSVSDSSVPRRDSLS 

KTSTPFNKSNKAASQQGTPWETLWFAINLKQL 

NVQMNMSNVMGNTTWTTSGLKSQGRLSVGSNR 

DREISMSVGLGRSQLDSKGGWGGTIDVNALEM 

VAfflSEHPNQQPSHKIQITMGSTEARVDYMGSSIL 

MGIFSNADLKLQDEWKVNLYNTLDSSITDKSEIF 

VHGDLKWDIFQVMISRSTTPDLIKIGMKLQEFFT 

QQFDTSKRALSTWGPVPYLPPKTMTSNLEKSSQE 

QLLDAAHHRHWPGVLKWSGCHISLFQIPLPEDG 

MQFGGSMSLHGNHNITLACFHGPNFRSKSWALF 

HLEEPNIAFWTEAQKIWEDGSSDHSTYTVQTLDF 

HLGHNTMVTKPCGALESPMATITKITRRRHENPP 

HGVASVKEWFNYVTATRNEELNLLKNVDANNT 

ENSTTVKNSSLLSGFRGGSSYNHETETIFALPRM 

QLDFKSIHVQEPQEPSLQDASLKPKVECSWTEF 

TDfflCVTMDAELIMFLHDLVSAYLKEKEKAIFPP 

RBLSTRPGQKSPIIIHDDNSSDKDREDSITYTTVDW 

RDFMCNTWHLEPTLRLISWTGRKIDPVGVDYILQ 

KLGFHHARTTIPKWLQRGVMDPLDKVLSVLIKK 

LGTALQDEKEKKGKDKEEH 


3792 


A 


1 


364 


QNGSTPLHHAASKNRHEIALMLLEGGANPDGKD 
HYEATAKHQATAKGNFKMIHILLYYKASTIIQDT 
EGNTPPHLVCD\RVEEAKLLVSQGA/SIYIENKEE 
KDP/LQVAKGALGLVLKRMVEG 


3793 


A 


2 


340 


DIVPNPKMAPLGDEAPTLEKVLTPELSEEEVSTR 
DDIQFHHFSSEEALQKVKYFVAKEDPSSQEEAHT 
PEAPPPQPPSSERCLGEMKCTLVRGDSSPRQAEL 
KSGPASRPAL 


3794 


A 


421 


158 


SYWVGEDYTYKFFEVILIDPFHKAIRRNOPDTQWI 
SKAVYKHREMCGLTSTGRKSHGLEKDRMFPHAI 
GGSCRAA*RRRKTLQFPCYH 


3795 


A 


24 


592


GGMDSRVSGTTSNGETKPVYPVMEKKEEDGTLE 

RGHWNNKMEFVLSVAGEIIGLGNVWRFPYLCYK 

NGGGAFFIPYLVFLFTCGIPVFLLETALGQYTSQG 

GVTAWRKICPIFEGIGYASQMIVILLNVYYIIVLA 

WALFYLFSSFTIDLPWGGCYHEWNTEHCMEFQK 

TNGSLNGTSENATSPVIEFW 


3796 


A 


3 


592 


KPASTYSTSQPSMAPLLPIRTLPLILILLALLSPGA 

ADFNISSLSGLLSPALTESLLVALPPCHLTGGNAT 

LMVRRANDSKVVTSSFWPPCRGRRELVSWDS 

GAGFTVTRLSAYQVTNLVPGTKFYISYLVKKGT 

ATESSREIPMFTLPRRNMESIGLGMARTGGMVVI 

TVLLSVAMFLLVLGFnALALGSRK 


3797 


A 


1 


1556 


ATRLLRGSGSWGCSRLRFGPPAYRRFSSGGAYPN 

PLSSPLPGVPKPVFATVDGQEKFETKVTTLDNGL 

RVASQNKFGQFCTVGILINSGSRYEAKYLSGIAH 

FLEKLAFSSTARFDSKDEILLTLEKHGGICDCQTS 

RDTTMYAVSADSKGLDTVVALLADWLQPRLT 

DEEVEMTRMAVQFELEDLNLRPDPEPLLTEMIHE 

AAYRENTVGLHRFCPTENVAKINREVLHSYLRN 

YYTPDRMVLAGVGVEHEHLVDCARKYLLGVQP 

AWGSAEAVDIDRSVAQYTGGIAKLERDMSNVSL 

GPTPIPELTHMVGLESCSFLEEDFIPFAVLNMNIM 

GGGGSFSAGGPGKGMFSRLYLNVLNRHHWMYN 

ATSYHHSYEDTGLLCIHASADPRQVREMVEnXK 

EFILMGGTVDTN^LERAKTQLTSMLMMNLESRP 

VIFEDVGRQVLATRSRKLPHELCTLIRNVKPEDV 

KRVASKMLRGKPAVAALGDLTDLPTYEfflQTAL 

SSKDGRLPRTYRLFR 


3798 


A 


73 


759 


KRLVEAGVPRTFDGIVGEGGAQSRSCWPWGVTA 

QTPAFSADSLNCLKNCMSITMGSVRPSVEQFHKY 

LPWFLNDRPNIKCPKGGLAAYSTSVNLTSDGQV 

LASRFMAYHKPLKNSQDYTEALRAARELAANIT 

ADLRKVPGTDPAFEVFPYTITNVFYEQYLTILPEG 

LFMLSLCLVPTFAVSCLLLGLDLRSGLLNLLSIV 

MILVDTVGFMALWGISYNAVSLINLVS 


3799 


A 


73 


759 


KRLVEAGVPRTFDGIVGEGGAQSRSCWPWGVTA 
QTPAFSADSLNCLKNCMSITMGSVRPSVEQFHKY 

LPWFLNDRPNIKCPKGGLAAYSTSVNLTSDGQV 

LASRFMAYHKPLKNSQDYTEALRAARELAANIT 

ADLRKVPGTDPAFEVFPYTITNVFYEQYLTILPEG 

LFMLSLCLVPTFAVSCLLLGLDLRSGLLNLLSIV 

MILVDTVGFMALWGISYNAVSLINLVS 


3800 


A 


250 


1032 


GIFRSLRVLFPLFSVGRPQFARSLSAAPQLSDTAD 

TMGFGDLKSPAGLQVLNDYLADKSYIEGYVPSQ 

ADVAVFEAVSSPPPADLCHALRWYNHIKSYEKE 

KASLPGVKKALGKYGPADVEDTTGSGATDSKD 

DDDIDLFGSDDEEESEEAKRLREERLAQYESKKA 

KKPALVAKSSILLDVKPWDDETDMAKLEECVRS 

IQADGLVWGSSKLVPVGYGIKKLQIQCVVEDDK 

VGTDMLEEQITAFEDYVQSMDVAAFNKI 


3801 


A 


155 


656 


SREMELVTFRDVADEFSPEEWKCLDPAQQNLYR 

DVMLENYRNLVSLGFVISNPDLVTCLEQIKEPCN 

LKIHETAAKPPAICSPFSQDLSPVQGIEDSFHKLIL 

KRYEKCGHENLQLRKGCKRVNECKVQKGVNNG 

VYQCLSTTQSKIFQCNTCVRVFSTSSHSNKHK 


3802 


A 


1 


1428 


VTVSPETHMDLTKGCVTFEDIAIYFSQDEWGLLD 

EAQRLLYLEVMLENFALVASLGCGHGTEDEETP 

SDQNVSVGVSQSKAGSSTQKTQSCEMCVPVLKD 

ILHLADLPGQKPYLVGECTNHHQHQKHHSAKKS 

LKRDMDRASYVKCCLFCMSLKPFRKWEVGKDL 

PAMLRLLRSLVFPGGKKPGTITECGEDIRSQKSH 

YKSGECGKASRHKHTPVYHPRVYTGKKLYECSK 

CGKAFRGKYSLVQHQRVHTGERPWECNECGKF 

FSQTSHLNDHRRIHTGERPYECSECGKLFRQNSS 

LVDHQKIHTGARPYECSQCGKSFSQKATLVKHQ 

RVHTGERPYKCGECGNSFSQSAILNQHRRIHTGA 

KPYECGQCGKSFSQKATLIKHQRVHTGERPYKC 

GDCGKSFSQSSILIQHRRIHTGARPYECGQCGKSF 

SQKSGLIQHQVVHTGERPYECNKCGNSFSQCSSL 

IHHQKCHNT 


3803 


A 


193 


617 


LFPFLGSESKNGEADSSDKEMKHGQKSPTGKQTS 

QHLKRLKKSGLGHLKWTKAEDIDIETPGSILVNT 

NLRALINKHTFASLPQHFQQYLLLLLPEVDRQMG 

SDGILRLSTSALNNEFFAYAAQGWKQRLAEGKF 

VFSIIM 


3804 


A 


197 


479 


SSSRASPPEHPSSQAHCGPLVLSHACPEVTNKWS 

TGSSSSPNSSWVSSPLQPEGLSGSSRMKGGSATKI 

LLETLLLAAHMTADQGIASSQRCLL 


3805 


A 


1 


385 


QSADTLFPGDn^nWVSGLFSAVTLQDTVSDRLAS 
EELPSTAVPTPATTPAPAPAPAPATAPALVSAAT 
KERTESEVPPRPASPKVTRSPPETAAPVEDMARR 
SELAVGGEEGTEGGRGEGTGSPMSSY 


3806 


A 


47 


1033 


LQGDTWHLSFLSHFSRLHGGVPGRGLLEGNLLQ 

PQAPGHDMTSIPFPGDRLLQVDGVILCGLTHKQA 

VQCLKGPGQVARLVLERRVPRSTQQCPSANDSM 

GDERTAVSLVTALPGRPSSCVSVTDGPKF*SSN* 

KRIANGLGFSFVQMEKESCSHLKSDLVRIKRLFP 

GHPAEENGAIAAGDIILGREWEGPRKASSSRCRG 

SWAMQLSVQAGPSFASYYPAAVEVLHLLRGAPQ 

EVTLLLCRPPPGALPELEQEWQTPELSADKEFTR 

ATCTDSCTSPILGSRGQLGGTVPPQMQGKAWGL 

RPESSQKAIREGTMGAKTERDLGPVP 


3807 


A 


656 


1238 


RCPSLLPPSWPLPTLQTLTRTPGNKAIAGGAGLW 

AVLWGSERTPPYR*GN*NQRGAVPCLRPHRLRP 

QDKFLVLASDGLWDMLSNEDVVRLWGHLAEA 

DWHKTDLAQRPANLGLMQSLLLQRKASGLHEA 

DQNAATJRLIRHAIGNNEYGEMEAERLAAMLTLP 

EDLARMYRDDITVTVVYFNSESIGAYYKGG 


3808 


A 


26 


2195 


SQYSESVAGRQASPERLLGSYHAMASTVEGGDT 

ALLPEFPRGPLDAYRARASFSWKELALFTEGEG 

MLRFKKTIFSALENDPLFARSPGADLSLEKYREL 

NFLRCKRIFEYDFLSVEDMFKSPLKVPALIQCLG 

MYDSSLAAKYLLHSLVFGSAVYSSGSERHLTYIQ 

KIFRMEIFGCFALTELSHGSNTKAIRTTAHYDPAT 

EEFIIHSPDFEAAKFWVGNMGKTATHAWFAKL 

CVPGDQCHGLHPFIVQIRDPKTLLPMPGVMVGDI 

GKKLGQNGLDNGFAMFHKVRVPRQSLLNRMGD 

VTPEGTYVSPFK£)VRQRFGASLGSLSSGRVSIVSL 

AILNLKLAVAIALRFSATRRQFGPTEEEEIPVLEY 

PMQQWRLLPYLAAVYALDHFSKSLFLDLVELQR 

GLASGDRSARQAELGREIHALASASKPLASWTT 

QQGIQECREACGGHGYLAMNRLGVLRDDNDPN 

CTYEGDNNILLQQTSNYLLGLLAHQVHDGACFR 

SPLKSVDFLDAYPGILDQKFEVSSVADCLDSAVA 

LAAYKWLVCYLLRETYQKLNQEKRSGSSDFEAR 

NKCQVSHGRPLALAFVELTVVQRFHEHVHQPSV 

PPSLRAVLGRLSALYALWSLSRHAALLYRGGYF 

SGEQAGEVLESAVLALCSQLKJDDAVALVDVIAP 

PDFVLDSPIGRADGELYKNLWGAVLQESKVLER 

ASWWPEFSVNKPVIGSLKSKL 


3809 


A 


117 


830 


CFGIMERVGCTLTTTYAHPRPTPTNFLPAISTMAS 

SYRDRFPHSNLTHSLSLPWRPSTYYKVASNSPSV 

APYCTRSQRVSENTMLPFVSNRTTFFTRYTPDDW 

YRSNLTNYQESNTSRHNSEKLRVDTSRLIQDKYQ 

QTRKTQADTTQNLGERVNDIGFWKSEIIHELDEM 

IGETNALTDVKKRLERALMETEAPLQVARECLF 

HREKRMGIDLVHDEVEAQLLTVNVGEMHQSQA 

A 


3810 


A 


3 


518 


VIQELEGGSGADLGEHSCRPASQPRFPRPAEARS 
HPATRRPASGPAMGKTNSKLAPEVLEDLVQNTE 
FSEQELKQWYKGFLKDCPSGILNLEEFQQLYIKF 
FPYGDASKFAQHAFRTFDKNGDGTIDFREnCAL 
SVTSRGSFEQKLNWAFEMYDLDGDGRITRJLEML 
EIIE 


3811 


A 


81 


1147 


GCGYGCSGAGGAAIGEPMAKWGEGDPRWIVEE 

RADATNVNNWHWTERDASNWSTDKLKTLFLAV 

QVQNEEGKCEVTEVSKLDGEASINNRKGKLIFFY 

EWSVKLNWTGTSKSGVQYKGHVEIPNLSDENSV 

DEVEISVSLAKDEPDTNLVALMKEEGVKLLREA 

MGIYISTLKTEFTQGMILPTMNGESVDPVGQPAL 

KTEERKAKPAPSKTQARPVGVKIPTCKITLKETFL 

TSPEELYRVFTTQELVQAFTHAPATLEADRGGKF 

HMVDGNVSGEFTDLVPEKHIVMKWRFKSWPEG 

HFATITLTFIDKNGETELCMEGRGffAPEEERTRQ 

GWQRYYFEGDCQTFGYGARLF 


3812 


A 


20 


558 


PCGTAASTHAYDRRAKCRQQQQQQQNGGQNKV 
RPAKKKTSPAREVSSESGTSGQFTPPSSTSVPTIAS 

SSAPVSIWSPASISPLSDPLSTSSSCMQRSYPMTYT 
QASGYSQGYAGSTSYFGGMDCGSYLTPMHHQL 
PGPGATLSPMGTNAVTSHLNQSPASLSTQGYGAS 
KLWGFNFNH 


3813 


A 


1 


1016 


CTEPPRRSTRTPAALASLRPYTDYVVVSDQILQES 

EDFFTLIESHEGKPLKLMVYNSKSDSCREVTVTP 

NAAWGGEGSLGCGIGYGYLHRIPTQPPSYHKKPP. 

GTPPPSALPLGAPPPDALPPGPTPEDSPSLETGSRQ 

SDYMEALLQAPGSSMEDPLPGPGSPSHSAPDPDG 

LPHFMETPLQPPPPVQRVMDPGFLDVSGISLLDN 

SNASVWPSLPSSTELTTTAVSTSGPEDICSSSSSHE 

RGGEATWSGSEFEVSFLDSPGAQAQADHLPQLT 

LPDSLTSAASPEDGLSAELLEAQAEEEPASTEGLD 

TGTEAEGLDSQAQISTTE*HPGL*QGP 


3814 


A 


2 


884 


VFWQVRNAGSSPLSAACPLFRTPAPQPCGSWGR 

CCIPHASTGCRPMAERGELDLTGAKQNTGVWLV 

KVPKYLSQQWAKASGRGEVGKLRIAKTQGRTE 

VSFTLNEDLANIHDIGGKPASVSAPREHPFVLQSV 

GGQTLTVFTESSSDKLSLEGIVVQRAECRPAASE 

NYMRLKRLQIEESSKPVRLSQQLDKVVTTNYKP 

VANHQYNIEYERKKKEDGKRARADKQHVLDML 

FSAFEKHQYYNLKDLVDITKQPVVYLKEILKEIG 

VQNVKGIHKNTWELKPEYRHYQGEEKSD 


3815 


A 


17 


411 


NIGDWEDIGKSPERIIQYYGPATWAQDGSRGYCT 
PIYMLNHIIRLQAVLEIIMNERANALDLLAQQTTK 
MRNANYQNRLALDYLLAHEGGV*GKFSLTNCC 
LEIDDNGKAIMEITARMRKLAHIPVQTWER 


3816 


A 


3 


1172 


SHWQRRDRRCVRNMAERGRKRPCGPGEHGQRI 

EWRKWKQQKKEEKKKWKDLKLMKKLERQRAQ 

EEQAKRLEEEEAAAEKEDRGRPYTLSVALPGSIL 

DNAQSPELRTYLAGQIARACAIFCVDEIVVFDEE 

GQDAKTVEGEFTGVGKKGQACVQLARILQYLEC 

PQYLRKAFFPKHQDLQFAGLLNPLDSPHHMRQD 

EESEFREGVVVDRPTRPGHGSFVNCGMKKEVKI 

DKM-EPGLRVTVRLNQQQHPDCKTYHGKVVSS 

QDPRTKAGLYWGYTVRLASCLSAVFAEAPFQDG 

YDLTIGTSERGSDVASAQLPNFRHALVVFGGLQG 

LEAGADADPNLEVAEPSVLFDLYVNTCPGQGSR 

TIRTEEAILISLAALQPGLIQAGARHT 


3817 


A 


246 


1197 


FLSAGMSNFTHYAYLLMBESLMLGKVPPHVPSH 

HFIFHDDGSARQKGESDYKVIIQQWFSKSGPWTT 

SSNVTWGLLELQQSISESAVLTBPPGDSGAGSNLI 

TMFLRNRKETDLCSGRSKVNRGWNSGRCKQRG 

KTEQPGEPLEHVYVTIKHAVALESRHQKGELQC 

LIKMCIPLSKPLQMFFSPPHWEAWLQRVQQLAK 

NTRYFRQRLQEMGFIIYGNENASVVPLLLYMPG
KVAAFARHMLEKKIGVVWGFPATPLAEARARF
CVSAAHTREMLDTVLEALDEMGDLLQLKYSRH
KKSARPELYDETSFELED


3818 


A 


215 


789 


npqssssegsseifqvnghnrllvqrsevtqapg 

qytvdveghgctfiqatlkynvllpkkasgfsls 

leivknysstafdltvtlkytgirnkss1s4vvidv 

kmlsgftptmssieelenkgqvmktevkndhvl 

fylenvfgradsftfsveqsnlvfniqpapgmvy 

dyyekeeyalafyhinsssvse


SEQ ID NO:


Method


Predicted beginning nucleotide location corresponding to first amino acid residue of peptide sequence


Predicted end nucleotide location corresponding to last amino acid residue of peptide sequence


Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid,
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine,
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine,
X=Unknown, *=Stop codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)


3819 


A 


1 


1483 


RIPDSUSRGVQGLPRDTASLSTTPSESPRAQATSR 

LSTASCPTPKVQSRCSSKENILRASHSAVDITKVA 

RRHRMSPFPLTSMDKAFITVLEMTPVLGTEIINYR 

DGMGRVLAQDVYAKDNLPPFPASVKDGYAVRA 

ADGPGDRFnGESQAGEQPTQTVMPGQVMRVTT 

GAPIPCGADAWQVEDTELIRESDDGTEELEVRIL 

VQARPGQDIRPIGHDIKRGECVLAKGTHMGPSEI 

GLLATVGVTEVEVNKFPWAVMSTGNELLNPED 

DLLPGKIRDSNRSTLLATIQEHGYPTINLGIVGDN 

PDDLLNALNEGISRADVIITSGGVSMGEKDYLKQ 

VLDIDLHAQIHFGRVFMKPGLPTTFATLDIDGVR 

KIIFALPGNPVSAVVTCNLFWPALRKMQGILDP 

RPTIIKARLSCDVKLDPRPEYHRCILTWHHQEPLP 

WAQSTGNQMSSRLMSMRSANGLLMLPPKTEQY 

VELHKGEVVDVMVIGRL 


3820 


A 


2216 


487 


PQEPALKSEFSQVASNTIPLPLPQPNTCKDNGPCK 

QVCSTVGGSAICSCFPGYAIMADGVSCEDQDECL 

MGAHDCSRRQFCVNTLGSFYCVNHTVLCADGYI 

LNAHRKCVDINECVTDLHTCSRGEHCVNTLGSF 

HCYKALTCEPGYALKDGECEDVDECAMGTHTC 

QPGFLCQNTKGSFYCQARQRCMDGFLQDPEGNC 

VDINECTSLSEPCRPGFSCINTVGSYTCQRNPLIC 

ARGYHASDDGTKCVDVNECETGVHRCGEGQVC 

HNLPGSYRCDCKAGFQRDAFGRGCBDVNECWAS 

PGRLCQHTCENTLGSYRCSCASGFLLAADGKRC 

EDVNECEAQRCSQECANIYGSYQCYCRQGYQLA 

EDGHTCTDIDECAQGAGILCTFRCLNVPGSYQCA 

CPEQGYTMTANGRSCKDVDECALGTHNCSEAET 

CHNIQGSFRCLRFECPPNYVQVSKTKCERTTCHD 

FLECQNSPARITHYQLNFQTGLLVPAHIFRIGPAP 

AFTGDTIALNIIKGNEEGYFGTRRLNAYTGVVYL 

QRAVLEPRDFALDVEMKLWRQGSVTTFLAKMHI 

FFTTFAL 


3821 


A 


2216 


487 


PQEPALKSEFSQVASNTIPLPLPQPNTCKDNGPCK 

QVCSTVGGSAICSCFPGYAIMADGVSCEDQDECL 

MGAHDCSRRQFCVNILGSFYCVNHTVLCADGYI 

LNAHRKCVDINECVTDLHTCSRGEHCVNTLGSF 

HCYKALTCEPGYALKDGECEDVDECAMGTHTC 

QPGFLCQNTKGSFYCQARQRCMDGFLQDPEGNC 

VDINECTSLSEPCRPGFSCINTVGSYTCQRNPLIC 

ARGYHASDDGTKCVDVNECETGVHRCGEGQVC 

HNLPGSYRCDCKAGFQRDAFGRGCIDVNECWAS

PGRLCQHTCENTLGSYRCSCASGFLLAADGKRC 

EDVNECEAQRCSQECANIYGSYQCYCRQGYQLA 

EDGHTCTDIDECAQGAGILCTFRCLNVPGSYQCA 

CPEQGYTMTANGRSCKDVDECALGTHNCSEAET 

CHNIQGSFRCLRFECPPNYVQVSKTKCERTTCHD 

FLECQNSPARITHYQLNFQTQLLVPAfflFRIGPAP 

AFTGDTIALNIIKGNEEGYFGTRRLNAYTGWYL 

QRAVLEPRDFALDVEMKLWRQGSVTTFLAKMHI 

FFTTFAL 


3822 


A 


2502 


1540


MAAATRGCRPWGSLLGLLGLVSAAAAAWDLAS 
LRCTLGAFCECDFRPDLPGLECDLAQHLAGQHL 
AKALVVKALKAFVRDPAPTKPLVLSLHGWTGTG 
KSYVSSLLAHYLFQGGLRSPRVHHFSPVLHFPHP
SHIERYKKDLKSWVQGNLTACGRSLFLFDEMDK 

MPPGLMEVLRPFLGSSWWYGTNYRKAIFIFISN 

TGGEQINQVALEAWRSRRDREEILLQELEPVISR 

AVLDNPHHGFSNSGIMEERLLDAVVPFLPLQRHH 

VRHCVLNELAQLGLEPRDEVVQAVLDSTTFFPE 

DEQLFSSNGCKTVASRIAFFL 


3823 


A 


1 


3174 


YGCEKTTEGRIPLKNIYRLFSADRKRVETALEAC 

SLPSSRNDSnPQEDFTPEVYRVFLNNLCPRPElDNI 

FSEFGAKSKPYLTVDQMMDFINLKQRDPRLNEIL 

YPPLKQEQVQVLIEKYEPNNSLARKGQISVDGFM 

RYLSGEENGWSPEKLDLNEDMSQPLSHYFINSS 

HNTYLTAGQLAGNSSVEMYRQVLLSGCRCVELD 

CWKGRTAEEEPVITHGFTMTTEISFKEVIEAIAEC 

AFKTSPFPILLSFENHVDSPKQQAKMAEYCRLIFG 

DALLMEPLEKYPLESGVPLPSPMDLMYKILVKN 

KKKSHKSSEGSGKKKLSEQASNTYSDSSSMFEPS 

SPGAGEADTESDDDDDDDDCKKSSMDEGTAGSE 

amateemsnlvnyiqpvkfesfeisigawksfem 

ssfvetkgleqltkspvefveynkmqlsriypkg 

trvdssnympqlfwnagcqmvalnfqtmdla 

mqinmgmyeyngksgyrlkpefmrrpdkhfdp 

ftegivdgivantlsvkiisgqflsdkkvgtyvev 

dmfglpvdtrrkafktktsqgnavnpvweeepi 

vfkkwlptlaClriavyeeggkfighrilpvqai 

rpgyhyiclrnernqpltlpavfvyievkdyvpd 

tyadviealsnpiryvnlmeqrakqlaaltlede 

eevkkeadpgetpseapsearttpaengvnhttt 

ltpkppsqalhsqpapgsvkapaktedliqsvlte 

veaqtieelkqqksfvklqkkhykemkdlvkr 

hhkkttdlikehttkyneiqndylrrraaleks 

akkdskkksepsspdhgsstieqdlaaldaemtq 

klidlkdkqqqqllnlrqeqyysekyqkrehik 

lliqkltdvaeecqnnqlkklkeicekekkelkk 

kmdkkrqekiteakskdksqmeeektemirsyi 

qevvqyikrleeaqskrqeklvekhkeirqqild 

ekpklqveleqeyqdkfkrlpleilefvqeamkg 

kisedsnhgsaplslssdpgkvnhktpsseelggd 

ipgkefdtpl 


3824 


A 


1 


426 


ilhwfvhrwsgrnnrekigvhvgfeeilnmepy 
ccretlkslrpecfiydlsavvmhhgkgfgsgh 
ytaycynseggfwvhcndsklsmctmdevcka 
qayilfytqrvtenghskllppelllgsqhpned 

ADTSSNEILS 


3825 


A 


3 


364 


girakfpnkipvwerypretflppldktkflvpq 
eltmtqflsiirsrmvlrateafyllvnnkslvs 
msatmaeiyrdykdedgfvymtyasqetfgcle 
saaprdgssledrplhpl 


3826 


A 


1 


1237 


pekkferecreaekaqqsyerldndtnatkadv 

ekakqqlnlrthmadenkneyaaqlqnfngeq 

hkhfyvvipqiykqlqemderrtiklsecyrgfa 

dserkvipiiskclegmilaaksvderrdsqmw 

dsfksgfeppgdfpfedysqhiyrtisdgtisaskq 

esgkmdakttvgkakgklwlfgkkpkgpaled 

fshlppeqrrkklqqridelnrelqkesdqkdal 

nkmkdvyeknpqmgdpgslqpkij^eimnnidr
LRMEIHKNEAWLSEVEGKTGGRGDRRHSSDINH 
LVTQGRESPEGSYTDDANQEVRGPPQQHGHHNE 
FDDEFEDDDPLPAIGHCKAIYPFDGHNEGTLAMK 
EGEVLYIIEEDKGDGWTRARRQNGEEGYVPTSYI 
DVTLEKNSKGS 


3827 


A 


2 


1584 


INPVSSAVNGEAHSSHETRGQNSNALPSVLLELL 

SQSCLIPAMSSYLRNDSVLDMARHVPLYRALLEL 

LRAIASCAAMVPLLLPLSTENGEEEEEQSECQTS 

VGTLLAKMKTCVDTYTNRLRSKRENVKTGVKP 

DASDQEPEGLTLLVPDIQKTAEIVYAATTSLRQA 

NQEKKLGEYSKKAAMKPKPLSVLKSLEEKYVAV 

MKKLQFDTFEMVSEDEDGKLGFKVNYHYMSQV 

KNANDANSAARARRLAQEAVTLSTSLPLSSSSSV 

FVRCDEERLDIMKVLITGPADTPYANGCFEFDVY 

FPQDYPSSPPLVNLETTGGHSVRFNPNLYNDGKV 

CLSILNTWHGRPEEKWNPQTSSFLQVLVSVQSLI 

LVAEPYFNEPGYERSRGTPSGTQSSREYDGNIRQ 

ATVKWAMLEQIR>[PSPCFKEVIHKHFYLKRVEIM 

AQCEEWIADIQQYSSDKRVGRTMSHHAAALKRH 

TAQLREELLKLPCPEGLDPDTDDAPEVCRATTGA 

EETLMHDQVKPSSSKELPSDFQL 


3828 


A 


1415 


845


PRVPATLVSLDFWHCFPTAGRLAGSTWVPPACT 

LQLGPSSEHELDNHRAPLLSLPSQESLSFTPWYLV 

ACKPLFHIFCPLFACFMQEGKVQYLFLHLSHMRL 

LNYYFFPFLAPESLMQALEDLDYLAALDNDGNL 
SEFGIIMSEFPLDPQLSKSILASCEFDCVDEVLTIA 
AMVTGILNDYSFSFFANLH 


3829 


A 


199 


683 


VDHTPVLSKPQCFSSVKWGATLSARSQKTSGIGR 
LMVHVIEATELKACKPNGKSNPYCEISMGSQSYT 
TRTIQDTLNPKWNFNCQFFIKDLYQDVLCLTLFD 
RDQFSPDDFLGRTEIPVAKIRTEQESKGPMTRRLL 
LHEVPTGEVWVRFDLQLFEQKTLL 


3830 


A 


1747 


404 


RKMMEESGIETTPPGTPPPNPAGLAATAMSSTPV 

PLAATSSFSSPNVSSMESFPPLAYSTPQPPLPPVRP 

SAPLPFVPPPAVPSVPPLVTSMPPPVSPSTAAAFG 

NPPVSHFPPSTSAPNTLLPAPPSGPPISGFSVGSTY 

DITRGHAGRAPQTPLMPSFSAPSGTGLLPTPITQQ 

ASLTSLAQGTGTTSAITFPEEQEDPRITRGQDEAS 

AGGIWGFIKGVAGNPMVKSVLDKTKHSVESMIT 

TLDPGMAPYIKSGGELDIVVTSNKEVKVAAVRD 

AFQEVFGLAVWGEAGQSNIAPQPVGYAAGLKG 

AQERIDSLRRTGVIHEKQTAVSVENFIAELLPDK 

WFDIGCLWEDPVHGIHLETFTQATPVPLEFVQQ 

AQSLTPQDYNLRWSGLLVTVGEVLEKSLLNVSR 

TDWHMAFTGMSRRQMIYSAARAIAGMYKQRLP 

PRTV 


3831 


A 


5 


674 


FWmSAWHEGLQQMKANDPSLQEVNLYNIKNIP 

IPTLREFAKALET^r^WKKFSLAATRSNDPVAIAF 

ADMLKVNTTLTSLNIESHFITGTGILALVEALKEN 

DTLTEIKIDNQRQQLGTAVEMEIAQMLEENSRIL 

KFGYQFTKQGPRTRVAAAITKNNDLAWQKDTQ 

EQTSIWQVVSQSIAGFNPQFEVQGQNARSWMEE 

LGKAFHQFVRRELKQTEGKLP 


3832 


A 


164 


782 


EPWVPMDVAESPERDPHSPEDEEQPQGLSDDDIL 
RDSGSDQDLDGAGVRASDLEDEESAARGPSQEE
EDNHSDEEDRASEPKSQDQDSEVNELSRGPTSSP 

CEEEGDEGEEDRTSDLRDEASSVTRELDEHELDY 

DEEVPEEPAPAVQEDEAEKAGAEDDEEKGEGTT 

REEGKAGVQSVGEKESLEAAKEKKKEDDDGEID 

DEEMY 


3833 


A 


122 


1676 


SQPPHFTQKMNENKDTDSKKSEEYEDDFEKDLE 

WLINENEKSDASIIEMACEKEENINQDLKENETV 

MEHTKRHSDPDKSLQDEVSPRKNDnSVPGIQPLD 

PISDSDSENSFQESKLESQKDLEEEEDEEVRRYIM 

EKIVQA>OaLQNQEPVM)KRERKLKFKDQLVDL 

EWPLEDTTTSKNYFENERNMFOKLSQLCISNDF 

GQEDVLLSLTNGSCEENKDRHLVERDGKFELLN 

lqdiasqgflppinnanstendpqqllprssnssv 

sgtkkedstakihavthsstgeplayiaqpplnr 

ktcpssavnsdrskgngksnhrtqsahispvtst 

yclsprqkelqkqleekreklkreeerrkieeek 

ekkrendivfkawlqkkreqvlemrriqrakei 

edmnsrqenrdpqqafrlwlkkkheeqmkerq 

teelrkqeeclfflkgtegkbrafkqwlrrkrm 

ekmaeqqavrertrqlrleakrskqlqhhlYm 

seakpfrftdhyn 


3834 


A 


575 


774 


rsrteelsnsgilkamskdlvtfgdvavnfsqee 
wewlnpaqrnlyrkvmlenyrslvslgkdmsp 


3835


A 


2 


100 


asdfylryyvghkgkfgheflefefrpdgvyv 


3836 


A 


91 


749 


rptpghgdfwmqpltkdagmslssvtlasalqv 

rgealseeeiwsllflaaeqlledlrndssdyvv 

cpwsallsaagslsfqgrvsffleaapfkapellq 

gqsedeqpdasqmhvyslgmtlywsagfhvpp 

hqplqlceplhsilltmcedqphrrctlqsvlea 

crvhekevsvypapaglhikrlvglvlgtisevs 

repcfsssscwscvaiki 


3837 


A 


3 


1214 


SLGCmSARGKGQDDEVRTLMANGAPFTTDWFS 

KLRVSCGYIGDNCKNGADVNAKDMLKMTALH 

WATERHHRDVVELLIKYGADVHAFSKFDKSAFD 

lALEKNNAEBLVILQEAMQNQVNVNPERANPVTD 

PVSMAAPFIFTSGEWNLASLISSTNTKTTSGDPH 

ASTVQFSNSTTSVLATLAALAEASVPLSNSHRAT 

ANTEEIIEGNSVDSSIQQVMGSGGQRVITIVTDGV 

PLGNIQTSIPTGGIGHPFIVTVQDGQQVLTVPAGK 

VAEETVIKBEEEEKLPLTKKPRIGEKTNSVEESKE 

GNERELLQQQLQEANRRAQEYRHQLLKKEQEAE 

QYRLKLEAIARQQPNGVDFTMVEEVAEVDAVV 

VTEGELEERETKVTGSAGATGPPTRVSMATVSS 


3838 


A 


1 


1332 


MIEDNKENKDHSLERGRASLIFSLKNEVGGLIKA 

LKIFQEKHVNLLHIESRKSKRRNSEFEIFVDCDIN 

REQLNDIFHLLKSHTNVLSVNLPDNFTLKEDGME 

TVPWFPKKISDLDHCANRVLMYGSELDADHPGF 

KDNVYRKRRKYFADLAMNYKHGDPIPKVEFTEE 

EIKTWGTVFQELNKLYPTHACREYLKNLPLLSKY 

CGYREDNIPQLEDVSNFLKERTGFSIRPVAGYLSP 

RDFLSGLAFRVFHCTQYVRHSSDPFYTPEPDTCH 

ELLGHVPLLAEPSFAQFSQEIGLASLGASEEAVQ 

KLATCYFFTVEFGLCKQDGQLRVFGAGLLSSISE 

LKHALSGHAKVKPFDPKITCKQECLITTFQDVYF 

VSESFEDAKEKMREFTKTKRPFGVKYNPYTRSI
QILKDTKSITSAMNELQHDLDWSDALAKVSRKP 
SI 


3839 


A 


3093 


520 


MVNFTVDQIRAIMDKKANIRNMSVIAHVDHGKS 

TLTDSLVCKAGIIASARAGETRFTDTRKDEQERCI 

HKSTAISLFYELSENDLNFIKQSKDGAGFLINLID 

SPGHVDFSSEVTAALRVTDGALWVDCVSGVCV 

QTETVLRQAIAERIKPVLMMNKMDRALLELQLE 

PEELYQTFQRIVENVNVnSTYGEGESGPMGNIMI 

DPVLGTVGFGSGLHGWAFI1.KQFAEMYVAKFA 

AKGEGQLGPAERAKKVEDMMKKLWGDRYFDP 

ANGKFSKSATSPEGKKLPRTFCQLILDPIFKVFDA 

IMNFKKEETAKLIEKLDIKLDSEDKDKEGKPLLK 

AVMRRWLPAGDALLQMITIHLPSPVTAQKYRCE 

LLYEGPPDDEAAMGIKSCDPKGPLMMYISKMVP 

TSDKGRFYAFGRVFSGLVSTGLKVRIMGPNYTPG 

KKEDLYLKPIQRTILMMGRYVEPIEDVPCGNIVG 

LVGVDQFLVKTGTITTFEHAHNMRVMKFSVSPV 

VRVAVEAKNPADLPKLVEGLKRLAKSDPMVQCI 

lEESGEHIIAGAGELHLEICLKDLEEDHACIPIKKS 

DPVVSYRETVSEESNVLCLSKSPNKHNRLYMKA 

RPFPDGLAEDIDKGEVSARQELKQRARYLAEKY 

EWDVAEARKIWCFGPDGTGPNILTDITKGVQYL 

NEIKDSWAGFQWATKEGALCEENMRGVRFDV 

HDVTLHADAIHRGGGQIIPTARRCLYASVLTAQP 

RLMEPIYLVEIQCPEQVVGGIYGVLNRKRGHVFE 

ESQVAGTPMFVVKAYLPVNESFGFTADLRSNTG 

GQAFPQCVFDHWQILPGDPFDNSSRPSQVVAETR 

KRKGLKEGIPALDNFLDKL 


3840 


A 


2 


753 


SSTRSRDFCCSEAIQGSLTRRERRASGVRTRRSQG 

SSAMASKILLNVQEEVTCPICLELLTEPLSLDCGH 

SLCRACITVSNKEAVTSMGGKSSCPVCGISYSFE 

HLQANQHLANIVERLKEVKLSPDNGKKRDLCDH 

HGEKLLLFCKEDRKVICWLCERSQEHRGHHTVL 

TEEVFKECQEKLQAVLKRl^KEEEEAEKLEADIR 

EEKTSWKYQVQTERQRIQTEFDQLRSILNNEEQR 

ELQRLEEEEKKT 


3841 


A 


2 


405 


GKAFSCFTYLSQHRRTHMAEKPYECKTCKKAFS 

HFGNLKVHERIHTGEKPYECKECRKAFSWLTCL 

LRHERIHTGKKSYECQQCGKAFTRSRFLRGHEKT 

HTGEKMHECKECGKALSSLSSLHRHKRTHWRDT 

L 


3842


A 


311 


88 


AVLKNMAPMTALGLLDLHILNLILFLSAGEDFTS 
VVSEIMMYILLVFLTLWLLIEMIYCYRKVSKAEE 
AAQENA 


3843 


A 


3 


1175 


APIRNSRIDDFVRRVESKATSARCGLWGSGPRRR 

PASGMFRGLSSWLGLQQPVAGGGQPNGDAPPEQ 

PSETVAESAEEELQQAGDQELLHQAKDFGNYLF 

NFASAATKKITESVAETAQTIKKSVEEGKIDGIID 

KTIIGDFQKEQKKFVEEQHTKKSEAAVPPWVDT 

NDEETIQQQILALSADKRNFLRDPPAGVQFNFDF 

DQMYPVALVMLQEDELLSKMRFALVPKLVKEE 

VFWRNYFYRVSLIKQSAQLTALAAQQQAAGKEE 

KSNGREQDLPLAEAVRPKTPPVVIKSQLKTQEDE 

EEISTSPGVSEFVSDAFDACNLNQEDLRKEMEQL 

VIJDKKQEETAVLffiDSADWEKELQQELQEYEV
VTESEKRDENWDKEIEKMLQEEN 


3844 


A 


798 


148 


LPPAQIPEAWLLLANWVVLDLVPLKDRLIDPLLL 

RCKLU>SALQKMALG^4FFGFTSVIVA0VLEMER 

LHYlHHNETVSQQIGEVLYNAAPLSIWWQffQYL 

UGISEIFASIPGLEFAYSEAPRSMQGAIMGIFFCLS 

GVGSLLGSSLVALLSLPGGWLHCPKDFGNINNCR 

MDLYFFLLAGIQAVTALLFVWIAGRYERASQGP 

ASHSRFSRDRG 


3845 


A 


3 


1934 


PEDSAPQYSRLFPNASQHITPSYNYAPNPDKHWI 

MRYTGPMKPIHMEFTNMLQRKRLQTLMSVDDS 

METIYNMLVETGELDNTYIVYTADHGYHIGQFG 

LVKGKSMPYEFDIRVPFYVRGPNVEAGCLNPHIV 

LNIDLAPTILDIAGLDIPADMDGKSILKLLDTERP 

VNRFHLKKXMRVWRDSFLVERGKLLHKRDNDK 

VDAQEENFLPKYQRVKDLCQRAEYQTACEQLG 

QKWQCVEDATGKLKLHKCKGPMRLGGSRALSN 

LVPKYYGQGSEACTCDSGDYKLSLAGRRKKLFK 

KKYKASYVRSRSIRSVAIEVDGRVYHVGLGDAA 

QPRNLTKRHWPGAPEDQDDKDGGDFSGTGGLP 

DYSAANPIKVTHRCYILENDTVQCDLDLYKSLQ 

AWKDHKLHIDHEIETLQNKIKNLREVRGHLKKK 

RPEECDCHKISYHTQHKGRLKHRGSSLHPFRKGL 

QEKDKVWLLREQKRKKKLRKLLKRLQNNDTCS 

MPGLTCFTHDNQHWQTAPFWTLGPFCACTSAN 

NNTYWCMRTINETHNFLFCEFATGFLEYFDLNT 

DPYQLMNAVNTLDRDVLNQLHVQLMELRSCKG 

YKQCNPRTRNMDLGLKDGGSYEQYRQFQRRKW 

PEMKRPSSKSLGQLWEGWEG 


3846 


A 


3 


1934 


PEDSAPQYSRLFPNASQHITPSYNYAPNPDKHWI 
MRYTGPMKPIHMEFTNMLQRKRLQTLMSVDDS 
METIYNMLVETGELDNTYIVYTADHGYHIGQFG 

lvkgksmpyefdirvpfyvrgpnveagclnpfflv 

lnidlaptildiagldipadmdgksilklldterp 

vnrfhlkkkmrvwrdsflvergkllhkrdndk 

vdaqeenflpkyqrvkdlcqraeyqtaceqlg 

qkwqcvedatgklklhkckgpmrlggsralsn 

lvpkyygqgseactcdsgdyklslagrrkklfk 

kkykasyvrsrsirsvaievdgrvyhvglgdaa 

qprnltkrhwpgapedqddkdggdfsgtgglp 

dysaanpikvthrcyilendtvqcdldlykslq 

awkdhklhidheietlqnkiknlrevrghlkkk 

rpeecdchkisyhtqhkgrlkhrgsslhpfrkgl 

qekdkvwllreqkrkkklrkllkrlqnndtcs 

mpgltcfthdnqhwqtapfwtlgpfcactsan 

nntywcmrtinethnflfcefatgfleyfdlnt

dpyqlmnavntldrdvlnqlhvqlmelrsckg 

ykqcnprtrnmdlglkdggsyeqyrqfqrrkw 

pemkrpsskslgqlwegweg 


3847 


A 


1 


1257 


mvfsavltafhtgtsnttfvvyentymnitlppp 
fqhpdlspllrysfetmaptglssltvnstavptt 
paafkslnlplqitlsaimifilfvsflgnlwclm 
vyqkaamrsainillaslafadmllavlnmpfa 
lvtilttrwifgkffcrvsamffwlfviegvaill 

nSIDRFLirVQRQDKLNPYRAKVLLWSWATSFCV 

afplavgnpdlqipsrapqcvfgyttnpgyqayv
ELISLISFFIPFLVILYSFMGILNTLRHNALRIHSYPE 

GICLSQASKLGLMGLQRPFQMSIDMGFKTRAFTT 

JLJLFAVFIVCWAPFTTYSLVATFSKHFYYQHNFF 

EISTWLLWLCYLKSALNPLIYYWRIKKFHDACLD 

MMPKSFKFLPQLPGHTKRRIRPSAVYVCGEHRT 

VV 


3848 


A 


3 


2827 


SSAVAARRRRSWASLVLAFLGVCLGITLAVDRS 

NFKTCEESSFCKRQRSIRPGLSPYRALLDSLQLGP 

DSLTVHLIHEVTKVLLVLELQGLQKNMTRFRIDE 

LEPRRPRYRVPDVLVADPPIARLSVSGRDENSVE 

LTMAEGPYKIILTARPFRLDLLEDRSLLLSVNARG 

LLEFEHQRAPRVSQGSKDPAEGDGAQPEETPRD 

GDKPEETQGKAEKDEPGAWEETFKTHSDSKPYG 

PMSVGLDFSLPGMEHVYGIPEHADNLRLKVTEG 

GEPYRLYNLDVFQYELYNPMALYGSVPVLLAHN 

PHRDLGIFWLNAAETWVDISSNTAGKTLFGKMM 

DYLQGSGETPQTDVRWMSETGIIDVFLLLGPSISD 

VFRQYASLTGTQALPPLFSLGYHQSRWNYRDEA 

DVLEVDQGFDDHNLPCDVIWLDIEHADGKRYFT 

WDPSRFPQPRTMLERLASKRRKLVAIVDPHIKVD 

SGYRVHEELRNLGLYVKTRDGSDYEGWCWPGS 

AGYPDFTNPTMRAWWANMFSYDNYEGSAPNLF 

VWNDMNEPSVFNGPEVTMLKDAQHYGGWEHR 

DVHNIYGLYVHMATADGLRQRSGGMERPFVLA 

RAFFAGSQRFGAVWTGDNTAEWDHLKISIPMCL 

SLGLVGLSFCGADVGGFFKNPEPELLVRWYQMG 

AYQPFFRAHAHLDTGRREPWLLPSQHNDIIRDAL 

GQRYSLLPFWYTLLYQAHREGIPVMRPLWVQYP 

QDVTTFNIDDQYLLGDALLVHPVSDSGAHGVQV 

YLPGQGEVWYDIQSYQKHHGPQTLYLPVTLSSIP 

VFQRGGTIVPRWMRVRRSSECMKDDPITLFVALS 

PQGTAQGELFLDDGHTFNYQTRQEFLLRRFSFSG 

NTLVSSSADPEGHFETPIWIERVVIIGAGKPAAVV 

LQTKGSPESRLSFQHDPETSVLVLRKPGINVASD 

WSIHLR 


3849 


A 


1 


1717 


RARNARGCWGVCRSGFSSAVCGAARMEQVAEG 

ARVTAVPVSAADSTEELAEVEEGVGVVGEDNDA 

AARGAEAFGDSEEDGEDVFEVEKILDMKTEGGK 

VLYKVRWKGYTSDDDTWEPEIHLEDCKEVLLEF 

RKKIAENKAKAVRKDIQRLSLNNDIFEANSDSDQ 

QSETKEDTSPKKKKKKLRQREEKSPDDUCKKKA 

KAGKLKDKSKPDLESSLESLVFDLRTKKRISEAK 

EELKESKKPKKDEVKETKELKKVKKGEIRDLKT 

KTREDPKENRKTKKEKFVESQVESESSVLNDSPF 

PEDDSEGLHSDSREEKQNTKSARERAGQDMGLE 

HGFEKPLDSAMSAEEDTDVRGRRKKKTPRKAED 

TRENRKLENKNAFLEKKTVPKKQRNQDRSKSAA 

ELEKLMPVSAQTPKGRRLSGEERGLWSTDSAEE 

DKETKRJflESKKPKKDEVKETKELKKVKKGEIRD 

LKTKTREDPKENRKTKKBKFVESQVESESSVLND 

SPFPEDDSEGLHSDSREEKQNTKSARERAGQDM 

GLEHGFEKPLDSAMSAEEDTDVRGRRPCKKTPRK 

AEDTRENRKLENKNAFLEKKTVPKKQRNQDRSK 

SAAELEKLMPVSAQTPKGRRLSGEERGLWSTDS 

AEEDBCETKRNESKKPKKDEVKETKELKKVKKGE
IRDLKTKTREDPKENRKTKKEKFVESQVESESSV 
LNDSPFPED/RQ*RATFRQQREEKSPDDLKKKKA 
KAGKLKDKSKPDLESSLBSLVFDLRTKKRISEAK 
EELKESKBCPK 


3850 


A 


1113 


3975 


PAAAAAAAAAAAAAAGRGPSFTPCFSPSLAVEPS 

RRTRLGSDPAQAMAGNVKKSSGAGGGSGSGGS 

GSGGLIGLMKDAFQPHHHHHHHLSPHPPGTVDK 

KMVEKCWKLMDKWRLCQNPKLALKNSPPYIL 

DLLPDTYQHLRTILSRYEGKMETLGENEYFRVF 

MENLMKKTKQTISLFKEGKERMYEENSQPRKNL 

TKLSUFSHMLAELKGIFPSGLFQGDTFRITKADA 

AEFWRKAFGEKTIVPWKSFRQALHEVHPISSGLE 

AMALKSTIDLTCNDYISVFEFDIFTRLFQPWSSLL 

RNWNSLAVTHPGYMAFLTYDEVKARLQKFIHKP 

GSYIFRLSCTRLGQWAIGYVTADGNILQTIPHNKP 

LFQALIDGFREGFYLFPDGRNQNPDLTGLCEPTP 

QDHIKVTQEQYELYCEMGSTFQLCKICAENDKD 

VKIEPCGHLMCTSCLTSWQESEGQGCPFCRCEIK 

GTEPIVVDPFDPRGSGSLLRQGAEGAPSPNYDDD 

DDERADDTLFMMKELAGAKVERPPSPFSMAPQA 

SLPPVPPRLDLLPQRVCVPSSASALGTASKAASGS 

LHKDKPLPVPPTLRDLPPPPPPDRPYSVGAESRPQ 

RRPLPCTPGDCPSRDKLPPVPSSRLGDSWLPRPIP 

KVPVSAPSSSDPWTGRELTNRHSLPFSLPSQMEP 

RPDVPRLGSTFSLDTSMSMNSSPLVGPECDHPKI 

KPSSSANAIYSLAARPLPVPKLPPGEQCEGEEDTE 

YMTPSSRPLRPLDTSQSSRACDCDQQIDSCTYEA 

MYNIQSQAPSITESSTFGEGNLAAAHANTGPEES 

ENEDDGYDVPKPPVPAVLARRTLSDISNASSS/FG 

LFVLERDP*PQNVTEGSQVPERPPKPFPRRINSER 

KAGSCQQGSGPAASAATA\SPQLSSEIENLMSQG 

YSYQDIQKALVIAQNNIEMAKNILREFVSISSPAH 

VAT 


3851 


A 


2 


2781 


GRVGSMDGAMGPRGLLLCMYLVSLLILQAMPA 

LGSATGRSKSSEKRQAVDTAVDGVFIRSLKVNC 

KVTSRFAHYVVTSQVVNTANEAREVAFDLEIPK 

TAFISDFAVTADGNAFIGDIKDKVTAWKQYRKA 

AISGENAGLVRASGRTMEQFTIHLTVNPQSKVTF 

QLTYEEVLKIWHMQYEIVIKVKPKQLVHHFEIDV 

DIFEPQGISKLDAQASFLPKELAAQTIKKSFSGKK 

GHVLFRPTVSQQQSCPTCSTSLLNGHFKVTYDVS 

RDKICDLLVANlfflFAHFFAPQNLTNMNKNVVFV 

IDISGSMRGQKVKQTKEALLKILGDMQPGDYFD 

LVLFGTRVQSWKGSLVQASEANLQAAQDFVRGF 

SLDEATNLNGGLLRGffilLNQVQESLPELSNHASI 

LIMLTDGDPTEGVTDRSQILKNVRNAIRGRFPLY 

NLGFGHNVDFNFLEVMSMENNGRAQRIYEDHD 

ATQQLQGFYSQVAKPLLVDVDLQYPQDAVLALT 

QNHHKQYYEGSEIWAGRIADNKQSSFKADVQA 

HGEGQEFSITCLVDEEEMKKLLRERGHMLENHV 

ERLWAYLTIQELLAKRMKVDREVRANLSSQALR 

MSLDYGFVTPLTSMSIRGMADQDGLKPTIDKPSE 

DSPPLEMLGPRRTFVLSALQPSPTHSSSNTQRLPD 

RVTGVDTDPHFIIHVPQKEDTLCFNINEEPGVILS 

LVQDPNTGFSVNGQLIGNKARSPGQHDGTYFGR

PCT/US01/04098



LGIANPATDFQLEVTPQNITLNPGFGGPVFSWRD 

QAVLRQDGVWTINKKRNLVVSVDDGGTF\EVV\ 

LHRVWVKGSSWHQDFLGLLMCWDKSIGMSSPGR 

KGCWGQ\FFHPIRFLKVS*HPPPGSDPQKAQMPT 

MVVRNPPGLTVT\RGLQKDYSKDPWHGAEVSC 

WFI\HNNGA*I\TDCAYTDYI\VPDIF 


3852 


A 


39 


1735 


TQVAEAGRGEGVVAGAETGRPQSAGMNLELLES 

FGQNYPEEADGTLDCISMALTCTFNRWGTLLAV 

GCNDGRIVIW\DF\LTRGIA*NKFSAfflHPVCSLC 

WSRDGHKLVSASTDNIVSQWDVLSGDCDQRFRF 

PSPILKVQYHPRDQNKVLVCPMKSAPVMLTLSD 

SKHVVLPVDDDSDLNVVASFDRRGEYIYTGNAK 

GKILVLKTDSQDLVASFRVTTGTSNTTAIKSIEFA 

RKGSCFLINTADRnRVYDGREILTCGRDGEPEPM 

QKLQDLVNRTPWKKCCFSGDGEYIVAGSARQH 

ALYIWEKSIGNLVKILHGTRGELLLDVAWHPVRP 

IIASISSGWSIWAQNQVENWSAFAPDFKELDEN 

VEYEERESEFDIEDEDKSEPEQTGADAAEDEEVD 

VTSVDPIAAFCSSDEELEDSKALLYLPIAPEVEDP 

EENPYGPPPDAVQTSLMDEGASSEKKRQSSADG 

SQPPKKKPKTTNIELQGVPNDEVHPLLGVKGDG 

KSKKKQAGRPKGSKGKEKDSPFKPKLYKGDRGL 

PLEGSAKGKVQAELSQPLTAGGAISELL 


3853 


A 


45 


2603 


PLLFTCGREVRARDPEKEGTIVVAGLKVQVQPRF 

LWILCFSMEETQGELTSSCGSKTMANVSLAFRDV 

SIDLSQEEWECLDAVQRDLYKDVMLENYSNLVS 

LDLEYKYITKNLLSEKNVCKIYLSQLQTGEKSKN 

TIHEDTIFRNGLQCKHEFERQERHQMGCVSQMLI 

QKQISHPLHPKIHAREKSYECKECRKAFRQQSYLI 

QHLRIHTGERPYKCMECGKAFCRVGDLRVHHTI 

HAGERPYECKECGKAFRLHYHLTEHQRIHSGVK 

PYECKECGKAFSRVRDLRVHQTIHAGERPYECK 

ECGKAFRLHYQLTEHQRIHTGERPYECKVCGKT 

FRVQRHISQHQKIHTGVKPYKCNECGKAFSHGS 

YLVQHQKIHTGEKPYECKECGKSFSFHAELARH 

RRIHTGEKPYECRECGKAFRLQTELTRHHRTHTG 

EKPYECKECGKAFICGYQLTLHLRTHTGEIPYEC 

KECGKTFSSRYHLTQHYRIHTGEKPYICNECGKA 

FRLQGELTRHHRIHTCEKPYECKECGKAFIHSNQ 

FISHQRIHTSESTYICKECGKIFSRRYNLTQHFKIH 

TGEKPYICNECGKAFRFQTELTQHHRIHTGEKPY 

KCTECGKAFIRSTHLTQHHRIHTGEKPYECTECG 

KTFSRHYHLTQHHRGHTGEKPYICNECGNAFICS 

YRLTLHQRIHTGELPYECKECGKTFSRRYHLTQH 

FRLHTGEKPYSCKECGNAFRLQAELTRHHIVHTG 

EKPYKCKECGKAFSVNSELTRHHRIHTGEKPYQC 

KECGKAFIRSDQLTLHQ\KIILVR\NPMHNVKRIR 

WPLENAL*QRICNLKNFLFVTEHVGIPFTSCSQF1 

RNYFVC 


3854 


A 


108 


894 


LQSCWVPGBPWPSVGWLSWLKDLPSCEIHSASLS 

AVLQGPQCSEMLWPKNLTSWDDSSSVSSGISDTI 

DNLSTDDINTSSSISSYANTPASSRKNLDVQTDAE 

KHSQVERNSLWSGDDVKKSDGGSDSGIKMEPGS 

KWRRNPSDVSDESDKSTSGKKNPVISQTGSWRR 

GMTAQVGITMPRTKASAPAGALKTPGTGKRPGL

WO 01/57190



S\GPGAPTPAAPPQLARMAWAFSLSAASTPAVSP 
STSPSAVEGSPATILPLASSPPPRTTP*LPLSELTV* 
RPQELVRGRGCXGPGAPTPAAFPQLARMAWAFS 
LSAASTPAVSPSTSPSAVEGSPATILPLASSPPPRT 

TP 


3855 


A 


1 


772 


FRGGDGAPGVLKPGNPLPFPLPPLQYPPPSTLSHS 

DNLAMTSRSTARPNGQPQASKICQFKLVLLGESA 

VGKSSLVLRFVKGQFHEYQESTIGAAFLTQSVCL 

DDTTVKFEIWDTAGQERYHSLAPMYYRGAQAAI 

VVyDITNQETFARAKTWVKELQRQASP\SIWGL 

AGNKADLANKRMVEYEEAQAYADDNSLLFMET 

SAKTAMNVNDLFL\AIA*EVAKRVNPQNLG\G\A 

AGRSRGVDLHEQS\QQNKSQCCSN 


3856 


A 


2815 


352 


LGLEAAARPRPGGPAAMQDGNFLLSALQPEAGV 

CSLALPSDLQLDRRGAEGPEAERLRAARVQEQV 

RARLLQLGQQPRHNGAAEPEPEAETARGTSRGQ 

YHTLQAGFSSRSQGLSGDKTSGFRPIAKPAYSPA 

SWSSRSAVDLSCSRRLSSAHNGGSAFGAAGYGG 

AQPTPPMPTRPVSFHERGGVGSRADYDTLSLRSL 

RLGPGGLDDRYSLVSEQLEPAATSTYRAFAYER 

QASSSSSRAGGLDWPEATEVSPSRTIRAPAVRTL 

QRFQSSHRSRGVGGAVPGAVLEPVARAPSVRSLS 

LSLADSGHLPDVHGFNSYGSHRTLQRLSSGFDDI 

DLPSAVKYLMASDPNLQVLGAAYIQHKCYSDAA 

AKKQARSLQAVPRLVKLFNHANQEVQRHATGA 

MRNLIYDNADNKLALVEENGIFELLRTLREQDDE 

LRKNVTGIL\VNLSSSDHLKDRLAKKTPLE\QLT\D 

LGV*APLSGAGGPP\LIQQNASEAEIFYNATGFPR 

NLSSASQATRQKMRECHGLVDALVTSINHALDA 

GKCEDKSVENAVCVLRNLSYRLYDEMPPSALQR 

LEGRGRRDLAGAPPGEVVGCFTPQSRRLRELPLA 

ADALTFAEVSKDPKGUEWLWSPQIVGLYNRLLQ 

RCELNRHTTEAAAGALQN1TGG\DPRGPGGLSRL 

ALEQERILNPLLDRVRTADHHQLRSLTGLIRNLS 

RNAKNKDEMSTKVV\SHLI\EKLPGSVGEKSPPAE 

VLV\NI\IAVFNNLGWLASPyALARDLLYFDGLRK 

LIFIKKKRDSPDSEKSSRAASSLLANLWQYNKLH 

RDFRAKGYRKEDFLGP 


3857 


A 


1034 


204 


VAVTLLSQLPSAIQRTAAWEMRAPLTFRVPLALD 

LKPEHCTVNVDNSLSIPVIAAELVVRKPSEKGM 

QQKKKTBCDLGFRAGKESKTEWRK*GLQDMASQ 

MFALPLK*PVTAAFHDSSMPSSLLQffiMEQLFLE 

ARLQ/PDSKSEARRNQCDSMLLRNQQLCSTCQE 

MKMVQPRTMKIPDDPKASFENCMSYRMSLHQP 

KFQTTPBPFHDDIPTENIHLQNUPILGPRTAVFHG 

LLTEAYKTLKERQRSSLPRKEPIGKTTEAVSGRSS 

SPPRLPERK 


3858 


A 


203 


3469 


SHQEIEQNSAMAPRKRGGRGISFffCCFRNNDHPE 

ITYRLRNDSNFALQTMEPALPMPPVEELDVMFSE 

LVDELDLTDKHREAMFALPAEKKWQIYCSKKK 

DQEENKGATSWPEFYIDQLNSMAARKSLLALEK 

EEEEERSKTIESLKTALRTKPMRFVTRFIDLDGLS 

CILhnTLKTMDYETSESRIHTSLIGCIKALMNNSQG 

RAHVLAHSESINVIAQSLSTENIKTKVAVLEILGA 

VCLVPGGHKKVLQAMLHYQKYASERTRFQTLIN
DLDKSTGRYRDEVSLKTADVtSFINAVLSQGAGVE 

SLDFRLHLRYE\FLMLGIHPVMDKLRKHENSTLD 

RmDFFEMLRNEDELEFAKRFELVfflDTKSATQM 

FELTRKRLTHSEAYPHFMSDLHHCLQMPYKRSGN 

TVQYWLLLDRnQQIVIQNDKGQDPDSTPLENFNI 

KNWRMLVNENEVKQWKEQAEKMRKEHNELQ 

QKLEKKERECDAKTQEKEEMMQTLNKMKEKLE 

KETTEHKQVKQQVADLTAQLHELSRRAVCASDP 

GGPSPGAPGGPFPSSVPGSLLPPPPPPPLPGGMLPP 

PPPPLPPGGPPPPPGPPPLGAIMPPPGAPMGLALK 

KKSIPQPTNALKSFNWSKLPENKLEGTVWTEIDD 

TKVFKILDLEDLERTFSAYQRQQDFFVNSNSKQK 

EADAIDDTLSSKLKVKELSVIDGRRAQNCMLLS 

RLKLSNDEIKRAILTMDEQEDLPKDMLEQLLKFV 

PEKSDroLLEEHKHELDRMAKADRFLFEMSRINH 

YQQRLQSLYFKKKFAERVAEVKPKVEAIRSGSEE 

VFRSGALKQLLEVVLAFGNYMNKGQRGNAYGF 

KISSLNKIADTKSSIDKNITLLHYLITIVENKYPSV 

LNLNEELRDIPQAAKVNMTELDKEISTLRSGLKA 

VETELEYQKSQPPQPGDKFVSVVSQFITVASFSFS 

DVEDLLAEAKDLFTKAVKHFGEEAGKIQPDEFF 

GIFDQFLQAVSEAKQENENMRKKKEEEERRARM 

EAQLKEQRERERKMRKAKENSEESGEFDDLVSA 

LRSGEVFDKDLSKLKRNRKRITNQMTDSSRERPI 

TKLNF 


3859 


A 


1279 


141 


RVEHLSEFLVDIKPSLTFDVIPLLDPYGPAGSDPS 

LEFLVVSEETYRGGMAINRFRLENDLEELALYQI 

QLLKDLRHTENEEDKVSSSSFRQRMLGNLLRPPY 

ERPELPTCLYVIGLTGISGSGKSSIAQRLKGLGAF 

VroSDHLGHRAYAPGGPAYQPVVEAFGTDILHK 

DGIINRKVLGSRWGNKKQLKILTDIMWPIIAKLA 

REEMDRAVAEGKRVCVIDAAVLLEAGWQNLVH 

EVWTAVIPETEAVRRIVERDGLSEAAAQSRLQSQ 

MSGQQLVEQSHVVLST\CGSRISPNARWRKPGPS 

CRSAFPRLIRPSTEKFSVGPDWLLELTSDPVVRRN 

GGLDAHPGSGPEVQAILCRTWPGLVDTGSLPNTL 

VFGQH 


3860 


A 


1 


3881 


MGQKSVGASYVQIPLVPPLSRHPKGLGHEDRWS 

SYCLSSLAAQNICTSKLHCPAAPEHTDPSEPRGSV 

SCCSLLRGLSSGWSSPLLPAPVCNPNKAIFTVDA 

KTTEDLVANDKACGLLGYSSQDLIGQKLTQFFLR 

SDSDVVEALSEEHMEADGHAAVVFGTWDIISRS 

GEKIPVSVWMKRMRQERRLCCWVLEPVERVST 

WVAFQSDGTVTSCDSLFAHLHGYVSGEDVAGQ 

HITDLIPSVQLPPSGQHIPKNLKIQRSVGRARDGT 

TFPLSLKLKSQPSSEEATTGEAAPVSGYRASVWV 

FCTISGLITLLPDGTIHGINHSFALTLFGYGKTELL 

GKNITFLIPGFYSYMDLAYNSSLQLPDLASCLDV 

GNESGCGERTLDPWQGQDPAEGGQDPRJNVVLA 

GGHWPRDEIRKLMESQDIFTGTQTELIAGGQLL 

SCLSPQPAPGVDNVPEGSLPVHGEQALPKDQQIT 

ALGREEPVAIESPGQDLLGESRSEPVDVKPFASCE 

DSEAPVPAEDGGSDAGMCGLCQKAQLERMGVS 

GPSGSDLWAGAAVAKPQAKGQLAGGSLLMHCP 

CYGSEWGLWWRSQDLAPSPSGMAGLSFGTPTLD
EPWLGVENDREELQTCLIKEQLSQLSLAGALDVP 

HAELVPTECQAVTAPVSSCDLGGRDLCGGCTGS 

SSACYALATDLPGGLEAVEAQEVDVNSFSWNLK 

ELFFSDQTDQTSSNCSCATSELRETPSSLAVGSDP 

DVGSLQEQGSCVLDDRELLLLTGTCVDLGQGRR 

FRESCVGHDPTEPLEVCLVSSEHYAASDRESPGH 

VPSTLDAGPEDTCPSAEEPRLNVQVTSTPVIVMR 

GAAGLQREIQEGAYSGSCYHRDGLRLSIQFEVRR 

VELQGPTPLFCCWLVKDLLHSQRDSAARTRLFL 

ASLPGSTHSTAAELTGPSLVEVLRARPWFEEPPK 

AVELEGLAACEGEYSQKYSTMSPLGSGAFGFVW 

TAVDKEKNKEVWKFIKKEKVLEDCWIEDPKLG 

KVTLEIAILSRVEHANIIKVLDIFENQGFFQLVME 

KHGSGLDLFAFIDRHPRLDEPLASYIFRQVRAG\Q 

SRLVSAVGYLRLKDIIHRDIKDENIVIAEDFTIKLI 

DFGSAAYLERGKLFYTFCGTIEYCAPEVLMGNPY 

RGPELEMWSLGVTLYTLVFEENPFCELEETVEAA 

IHPPYLVSKELMSLVSGLLQPVPERRTTLEKLVT 

DPWVTQPVNLADYTWEEVFRVNKPESGVLSAAS 

LEMGNRSLSDVAQAQELCGGPVPGEAPNGQGCL 

HPGDPRLLTS 


3861 


A 


1 


3881 


MGQKSVGASYVQIPLVPPLSRHPKGLGHEDRWS 

SYCLSSLAAQNICTSKLHCPAAPEHTDPSEPRGSV 

SCCSLLRGLSSGWSSPLLPAPVCNPNKAIFTVDA 

KTTEILVANDKACGLLGYSSQDLIGQKLTQFFLR 

SDSDWEALSEEHMEADGHAAWFGTVVDnSRS 

GEKIPVSVWMKRMRQERRLCCVVVLEPVERVST 

WVAFQSDGTVTSCDSLFAHLHGYVSGEDVAGQ 

HITDLIPSVQLPPSGQfflPKNLKIQRSVGRARDGT 

TFPLSLKLKSQPSSEEATTGEAAPVSGYRASVWV 

FCTISGLITLLPDGTIHGINHSFALTLFGYGKTELL 

GKNITFLIPGFYSYMDLAYNSSLQLPDLASCLDV 

GNESGCGERTLDPWQGQDPAEGGQDPRINVVLA 

GGHWPRDEIRKLMESQDIFTGTQTELIAGGQLL 

SCLSPQPAPGVDNVPEGSLPVHGEQALPKDQQIT 

ALGREEPVAffiSPGQDLLGESRSEPVDVKPFASCE 

DSEAPVPAEDGGSDAGMCGLCQKAQLERMGVS 

GPSGSDLWAGAAVAKPQAKGQLAGGSLLMHCP 

CYGSEWGLWWRSQDLAPSPSGMAGLSFGTPTLD 

EPWLGVENDREELQTCLIKEQLSQLSLAGALDVP 

HAELVPTECQAVTAPVSSCDLGGRDLCGGCTGS 

SSACYALATDLPGGLEAVEAQEVDVNSFSWNLK 

ELFFSDQTDQTSSNCSCATSELRETPSSLAVGSDP 

DVGSLQEQGSCVLDDRELLLLTGTCVDLGQGRR 

FRESCVGHDPTEPLEVCLVSSEHYAASDRESPGH 

VPSTLDAGPEDTCPSAEEPRLNVQVTSTPVIVMR 

GAAGLQREIQEGAYSGSCYHRDGLRLSIQFEVRR 

VELQGPTPLFCCWLVKDLLHSQRDSAARTRLFL 

ASLPGSTHSTAAELTGPSLVEVLRARPWFEEPPK 

AVELEGLAACEGEYSQKYSTMSPLGSGAFGFVW 

TAVDKEKNKEVWKFIKKEKVLEDCWEEDPKLG 

KVTLEIAILSRVEHANIIKVLDIFENQGFFQLVME 

KHGSGLDLFAFIDRHPRLDEPLASYIFRQVRAG\Q 

SRLVSAVGYLRLmiHRDIKDENIVIAEDFTIKLI 

DFGSAAYLERGKLFYTFCGTBEYCAPEVLMGNPY 

RGPELEMWSLGVTLYTLVFEENPFCELEETVEAA 

IHPPYLVSKELMSLVSGLLQPVPERRTTLEKLVT 

DPWVTQPVNLADYTWEEVFRVNKPESGVLSAAS 

LEMGNRSLSDVAQAQELCGGPVPGEAPNGQGCL 

HPGDPRLLTS 


3862 


A 


399 


2069 


TMDRSKKNSIAGFPPRVEVRLEEFEGGGGGEGNV 

SQVGRVWPSSYRALISAFFRLTRLDDFTCEKIGSG 

FFSEVFKVRHRASGQVMALKMNTLSSNRANML 

KEVQLMNRLSHPNILRYINSGNLEQLLDSNLHLP 

WTVRVKLAYDIAVGLSYLHFKGIFHRDLTSKNC 

LIKRDENGYSAWADFGLAEKIPDVSMGSEKLA 

VVGSPFWMAPEVLRDEPYNEKADVFSYGIILCEII 

ARIQADPDYLPRTENFGLDYDAFQHMVGDCPPD 

FLQLTFNCCNMDPKLRPSFVEIGKTLEEILSRLQE 

EEQERDRKLQPTARGLLEKAPGVKRLSSLDDKIP 

HKSPCPRRTIWLSRSQSDIFSRKPPRTVSVLDPYY 

RPRDGAARTPKVNPFSARQDLMGGKIKFFDLPSK 

SVISLVFDLDAPGPGTMPLADWQEPLAPPIRRWR 

SLPGSPEFLHQEACPFVGREESLSDGPPPRLSSLK 

YRVKEIPPFRASALPAAQAHEAMDCSILQEENijF 

GSRPQGTSPCPAGASEEMEVEERPAGSTPATFSTS 

GIGLQTQGKQDG 


3863 


A 


399 


2069 


TMDRSKRNSIAGFPPRVEVRLEEFEGGGGGEGNV 

SQVGRVWPSSYRALISAFFRLTRLDDFTCEKIGSG 

FFSEVFKVRHRASGQVMALKMNTLSSNRANML 

KEVQLMNRLSHPNILRYINSGNLEQLLDSNLHLP 

WTVRVKLAYDIAVGLSYLHFKGIFHRDLTSKNC 

LIKRDENGYSAWADFGLAEKIPDVSMGSEKLA 

VVGSPFWMAPEVLRDEPYNEKADVFSYGIILCEII 

ARIQADPDYLPRTENFGLDYDAFQHMVGDCPPD 

FLQLTFNCCNMDPKLRPSFVEIGKTLEEILSRLQE 

EEQERDRKLQPTARGLLEKAPGVKRLSSLDDKIP 

HKSPCPRRTIWLSRSQSDIFSRKPPRTVSVLDPYY 

RPRDGAARTPKVNPFSARQDLMGGKIKFFDLPSK 

SVISLVFDLDAPGPGTMPLADWQEPLAPPIRRWR 

SLPGSPEFLHQEACPFVGREESLSDGPPPRLSSLK 

YRVKEEPPFRASALPAAQAHEAMDCSILQEENGF 

GSRPQGTSPCPAGASEEMEVEERPAGSTPATFSTS 

GIGLQTQGKQDG 


3864 


A 


3 


911 


SWNMDSDSCAAAFHPEEYSPSCKRRRTVEDFNK 

FCTFVLAYAGYIPYPKEELPLRSSPSPANSTAGTI 

DSDGWDAGFSDIASSVPLPVSDRCFSHLQPTLLQ 

RAKPSNFLLDRKKTDKLKKKKKRKRRDSDAPGK 

EGYRGGLLKLEAADPYVETPTSPTLQDIPQAPSD 

PCSGWDSDTPSSGSCATVSPDQVKEIKTEGKRTI 

VR/QEAQLMARNDGNFSSLLESIFPS\DDDSWDLV 

TCFCMKPFAGRPMIECNECHTWIHLSCAKIRKSN 

VPEVFVCQKCRDSKFDIRRSNRSRTGSRKLFLD 


3865 


A 


3 


3573 


QERLRSRSRPDRAAREAGSARGRQPKRTERVEQ 

FLTIARRRGRRSMPVSLEDSGEPTSCPATDAETAS 

EGSVESASETRSGPQSASTAVKERPASSEKVKGG 

DDHDDTSDSDSDGLTLKELQNRLRRKREQEPTE 

RPLKGIQSRLRKKRREEGPAETVGSEASDTVEGV 

LPSKQEPENDQGWSQAGKDDRESKLEGKAAQD 

IKDEEPGDLGRPKPECEGYDPNALYCICRQPHNN 

RFMICCDRCEEWFHGDCVGISEARGRLLERNGE 

DYICPNCTILQVQDETHSETADQQEAKWRPGDA 

DGTDCTSIGTffiQKSSEDQGIKGRIEKAANPSGKK 

KLKIFQPGPGPVPTQLPVLWQVLEIAVSRSISAFT 

LLHCISCKVIEAPGASKCIGPGCCHVAQPDSVYCS 

NDCILKHAAATMKFLSSGKEQKPKPKEKMKMK 

PEKPSLPKCGAQAGIKISSVHKRPAPEKKETTVK 

KAVVVPARSEALGKEAACESSTPSWASDHNYNA 

VKPEKTAAPSPSLLYKSTKEDRRSEEKAAATAAS 

KKTAPPGSTVGKQPAPRNLVPKKSSFANVAAAT 

PAIKKPPSGFKGTIPKRPWLSATPSSGASAARQAG 

PAPAAATAASKKFPGSAALVGAVRKPVVPSVPM 

ASPAPGRLGAMSAAPSQPNSQIRQNIRRSLKEIL 

WK/RFLFFILFRVNDSDDLIMTENEVGKIALHIEK 

EMFNLFQVTDN/RAYKSKYRSB^IFNLKDPKNQG 

LFHRVLREEISLAKLVRLKPEELVSKELSTWKER 

PARSVMESRTKLHNESKKTAPRQEAIPDLEDSPP 

VSDSEEQQESARAVPEKSTAPLLDVFSSMLKDTT 

SQHRAHLFDLNCKICTGQVPSAEDEPAPKKQKLS 

ASVKKEDLKSKHDSSAPDPAPDSADEVMPEAVP 

EVASEPGLESASHPNVDRTYFPGPPGDGHPEPSPL 

EDLSPCPASCGSGVVTTVTVSGRDPRTAPSSSCT 

AVASAASRPDSTHMVEARQDVPKFVLTSVMVPK 

SILAKPSSSPDPRYLSVPPSPNISTSESRSPPEGDTT 

LFLSRLSTIWKGFINMQSVAKFVTKAYPVSGCFD 

YLSEDLPDTIHIGGRIAPKTVWDYVGKLKSSVSK 

ELCLIRFHPATEEEEVAYISLYSYFSSRGRFGVVA 

NNNRHVKDLYLIPLSAQDPVPSKLLPFEGPGKRR 

LSGWR 


3866 


A 


2 


3181 


AQQPVGRRGGASGAGGGRRGTPRPRAGAGPGF 

QVSSGGCRLSKMRRFLRPGHDPVRERLKRDLFQ 

FNKTVEHGFPHQPSALGYSPSLRILAIGTRSGAIK 

LYGAPGVEFMGLHQENNAVTQIHLLPGQCQLVT 

LLDDNSLHLWSLKVKGGASELQEDESFTLRGPP 

GAAPSATQITVVLPHSSCELLYLGTESGNVFVVQ 

LPAFRALEDRTISSDAVLQRLPEEARHRRVFEMV 

EALQEHPRDPNQILIGYSRGLVVIWDLQGSRVLY 

HFLSSQQLENIWWQRDGRLLVSCHSDGSYCQWP 

VSSEAQQPEPLRSLVPYGPFPCKAITRILWLTTRQ 

G\LPFTIFQGGMPRASYGDRHCISVIHDGQQTAFD 

FTSRVIGFTVLTEADPAATFDDPYALWLAEEEL 

VVIDLQTAGWPPVQLPYLASLHCSAITCSHHVSN 

IPLKLWERIIAAGSRQNAHFSTMEWPIDGGTSLTP 

APPQRDLLLTGHEDGTVRFWDASGVCLRLLYKL 

STVRVFLTDTDPNENLSAQGEDEWPPLRKVGSF 

DPYSDDPRLGIQKIFLCKYSGYLAVAGTAGQVLV 

LELNDEAAEQAVEQVEADLLQDQEGYRWKGHE 

RLAARSGPVRFEPGFQPFVLVQCQPPAVVTSLAL 

HSEWRLVAFGTSHGFGLFDHQQRRQVFVKCTLH 

PSDQLALEGPLSRVKSLKKSLRQSFRRMRRSRVS 

SRKRHPAGPPGEAQEGSAKAERPGLQNMELAPV 

QRKIEARSAEDSFTGFVRTLYFADTYLKDSSRHC 

PSLWAGTNGGTIYAFSLRVPPAERRMDEPVRAE 

QAKEIQLMHRAPWGILVLDGHSVPLPEPLEVAH 

DLSKSPDMQGSHQLLWSEEQFKVFTLPKVSAK 

LKLKLTALEGSRVRRVSVAHFGSRRAEDYGEHH 

LAVLTNLGDIQVVSLPLLKPQVRYSCIRREDVSGI 

ASCVFTKYGQGFYLISPSEFERFSLSTKGVLVEPRC 

LVDSAETKNHRPGNGAGPKKAPSRARNSGTQSD 

GEEKQPGLVMERALLSDERAATGWHIEPPWGA 

ASAMAEQSEWLSVQAAR 


3867 


A 


2 


3181 


AQQPVGRRGGASGAGGGRRGTPRPRAGAGPGF 

QVSSGGCRLSKMRRFLRPGHDPVRERLKRDLFQ 

FNKTVEHGFPHQPSALGYSPSLRILAIGTRSGAIK 

LYGAPGVEFMGLHQENNAVTQIHLLPGQCQLVT 

LLDDNSLHLWSLKVKGGASELQEDESFTLRGPP 

GAAPSATQITVVLPHSSCELLYLGTESGNVFVVQ 

LPAFRALEDRTISSDAVLQRLPEEARHRRVFEMV 

EALQEHPRDPNQILIGYSRGLVVIWDLQGSRVLY 

HFLSSQQLENIWWQRDGRLLVSCHSDGSYCQWP 

VSSEAQQPEPLRSLVPYGPFPCKAITRILWLTTRQ 

GXLPFTIFQGGMPRASYGDRHCISVIHDGQQTAFD 

FTSRVIGFTVLTEADPAATFDDPYALVVLAEEEL 

VVIDLQTAGWPPVQLPYLASLHCSAITCSHHVSN 

PLKLWERHAAGSRQNAHFSTMEWPIDGGTSLTP 

APPQRDLLLTGHEDGTVRFWDASGVCLRLLYKL 

STVRVFLTDTDPNENLSAQGEDEWPPLRKVGSF 

DPYSDDPRLGIQKIFLCKYSGYLAVAGTAGQVLV 

LELNDEAAEQAVEQVEADLLQDQEGYRWKGHE 

RLAARSGPVRFEPGFQPFVLVQCQPPAVVTSLAL 

HSEWRLVAFGTSHGFGLFDHQQRRQVFVKCTLH 

PSDQLALEGPLSRVKSLKKSLRQSFRRMRRSRVS 

SRKRHPAGPPGEAQEGSAKAERPGLQNMELAPV 

QRKIEARSAEDSFTGFVRTLYFADTYLKDSSRHC 

PSLWAGTNGGTIYAFSLRVPPAERRMDEPVRAE 

QAKEIQLMHRAPVVGILVLDGHSVPLPEPLEVAH 

DLSKSPDMQGSHQLLVVSEEQFKVFTLPKVSAK 

LKLKLTALEGSRVRRVSVAHFGSRRAEDYGEHH 

LAVLTNLGDIQVVSLPLLKPQVRYSCIRREDVSGI 

ASCVFTKYGQGFYLISPSEFERFSLSTKGXLVEPRC 

LVDSAETKNHRPGNGAGPKKAPSRARNSGTQSD 

GEEKQPGLVMERALLSDERAATGWHIEPPWGA 

ASAMAEQSEWLSVQAAR 


3868 


A 


1 


2497 


GDSGGPLVCEEPSGRPFLAGIVSWGIGCAEARRP 

GVYARVTRLRDWILEATTKASMPLAPTMAPAPA 

APSTAWPTSPESPVVSTPTKSMQALSTVPLDWVT 

VPKLQECGARPAMEKPTRVVGGFGAASGEVPW 

QVSLKEGSRHFCGATVVGDRWLLSAAHCFNHT 

KVEQVRAHLGTASLLGLGGSPVKIGLRRWLHP 

LYNPGILDFDLAVLELASPLAFNKYIQPVCLPLAI 

QKFPVGRKCMISGWGNTQEGNATKPELLQKASV 

GIIDQKTCSVLYNFSLTDRMICAGFLEGKVDSCQ 

VSGIKALYESELADARRVLDETARERARLQIEIG 

KUIAELDEVNKSAKKREGELTVAQGRVKDLESL 

FHRSEVELAAALSDKRGLESDVAELRAQLAKAE 

DGHAVAKKQLEKETLMRVDLENRCQSLQEELDF 

RKSVFEEEVRETRRRHERRLVEVDSSRQQEYDFK 

MAQALEELRSQHDEQVRLYKLELEQTYQAKLDS 

AKLSSDQNDKAASAAREELKEARMRLESLSYQL 

SGLQKQASAAEDRIRELEEAMAGERDKFRKMLD 

AKEQEMTEMRDVMQQQLAEYQELLDVKLALD 

MEINAYRKLLEGEEERLKLSPSPSSRVTVSRATSS 

SSGSLSATGRLGRSKRKR\WRWRSPW\QRPKRPG 

HGHGWQRWLPPGPAGLGLGQRXHIEEIDLEGKFV 

QLKNNSDKDQSLGNWRIKRQVLEGEEIAYKFTP 

KYILRAGQMVTVWAAGAGVAHSPPSTLVWKGQ 

SSWGTGESFRTVLVNADGEEVAMRTVKKSSVM 

RENENGEEEEEEAEFGEEDLFHQQGDPRTTSRGC 

YVM 


3869 


A 


1 


1942 


RYRAGIPGDGRKDYIRLTRPGLTLPGRAMFARGS 

RRRRSGRAPPEAEDPDRGQPCNSCREQCPGFLLH 

GWRKICQHCKCPREEHAVHAVPVDLERIMCRLIS 

DFQRHSISDDDSGCASEEYAWVPPGLKPEQVYQ 

FFSCLPEDKVPYVNSPGEKYRIKQLLHQLPPHDS 

EAQYCTAL\EE\EEKKELRAFSQQRKRENLG/RLG 

lVRIFPVn'nGAI\CEECGKQIGGGDIAVF\ASRASL 

GLLLGQPSCFWCTTCQELLVDLIYFYHVGKVYC 

GRHHAECLRPRCQACDEIIFSPECTEAEGRHWHM 

DHFCCFECEASLGGQRYVMRQSRPHCCACYEAR 

HAEYCDGCGEHIGLDQGQMAYEGQHWHASDRC 

FCCSRCGRALLGRPFLPRRGLIFCSRACSLGSEPT 

APGPSRRSWSAGPVTAPLAASTASFSAVKGASET 

TTKGTSTELAPATGPEEPSRFLRGAPHRHSMPEL 

GLRSVPEPPPESPGQPNLRPDDSAFGRQSTPRVSF 

RDPLVSEGGPRRTLSAPPAQRRRPRSPPPRAPSRR 

RHHHHNHHHHHNRHPSRRRHYQCDAGSGSDSE 

SCSSSPSSSSSESSEDDGFFLGERIPLPPHLCRPMP 

AQDTAMETFNSPSLSLPRDSRAGMPRQARDKNC 

IVA 


3870 


A 


2 


3485 


FVWRVFYVHASCMPPRARSWEGAHAPVGMHV 

AEAHACSSQQQQMPPAQFWMLEWLLHLCAFLS 

TPSFPHWCCCSNPHGSIADKPEEIVPASKPSRAAE 

NMAVEPRVATIKQRPSSRCFPAGSDMNSVYERQ 

GIAVMTPTVPGSPKAPFLGIPRGTMRRQKSIDSRI 

FLSGITEEERQFLAPPMLKFTRSLSMPDTSEDIPPP 

PQSVPPSPPPPSPTTYNCPKSPTPRVYGTIKPAFNQ 

NSAAKVSPATRSDTVATMMREKGMYFRRELDR 

YSLDSEDLYSRNAGPQANFRNKRGQMPENPYSE 

VGKIASKAVYVPAKPARRKGMLVKQSNVEDSPE 

KTCSIPIPTIIVKEPSTSSSGKSSQGSSMEIDPQAPE 

PPSQLRPDESLTVSSPFAAAIAGAVRDREKRLEA 

RRNSPAFLSADLGDEHVGLGPPAPRTRPSMFPEE 

GDFADEDSAEQLSSPMPSATPREPENHFVGGAEA 

SAPGEAGRPLNSTSKAQGPESSPAVPSASSGTAG 

PGNYVHPLTGRLLDPSSPLALALSARDRAMKES 

QQGPKGEAPKADLNKPLYIDTKMRPSLDAGFPT 

VTRQNTRGPLRRQETENKYETDLGRDRKGDDK 

KNMLIDIMDTSQQKSAGLLMVHTVDATKLDNA 

LQEEDEKAEVEMKPDSSPSEVPEGVSETEGALQI 

SAAPEPTTVPGRTIVAVGSMEEAVILPFRIPPPPLA 

SVDLDEDFIFTEPLPPPLEFANSFDIPDDRAASVPA 

LSDLVKQKKSDTPQSPSLNSSQPTT^SADSKKPAS 

LSNCLPASFLPPPESFDAVADSGIEEVDSRSSSDH 

HLETTSTISTVSSISTLSSEGGENVDTCTVYADGQ 

AFMVDKPPVPPKPKMKPIIHKSNALYQDALVEE 

DVDSFVEPPPAPPPPPGSAQPGMAKVLQPRTSKL 

WGDVTEIKSPILSGPKANVISELNSILQQMNREKL 

AKPGEGLDSPMGAKSASLAPRSPEIMSTISGTRST 

TVTFTVRPGTSQPITLQSRPPDYESRTSGTRRAPS 

PWSPTEMNKETLPAPLSAATASPSPALSDVFSLP 

SQPPSGDLFGLNPAGRSRSPSPSILQQPISNKPFTT 

KPVHLWTKPDVADWLESLNLGEHKEAFMDNEI 

DGSHLPNLQKEDLIDLGVTRVGHRMNIERALKQ 

LLDR 


3871 


A 


35 


1171 


VESRSAWHEGEDQIDRLDFIRNQMNLLTLDVKK 

KIKEVTEEVANKVSCAMTDEICRLSVLVDEFCSE 

FHPNPDVLKIYKSELNKHBDGMGRNLADRCTD 

EVNALVLQTQQEIIENLKPLLPAGIQDKLHTLIPC 

KKFDLSYNLNYHKLCSDFQEDIVFRFSLGWSSLV 

HRFLGPRNAQRVLLGLSEPIFQLPRSLASTPTAPT 

TPATPDNASQEELMITLVTGLASVTSRTSMGIirV 

GGVIWKTIGWKLLSVSLTMYGALYLYERLSWTT 

HAKERAFKQQFVNYATBKLRMIVSSTSANCSHQ 

VKQQIATTFARLCQQVDITQKQLEEEIARLPKEID 

QLEKIQNNSKLLRNKAVQLENELENFTKQFLPSS 

NEES 


3872 


A 


35 


1171 


VESRSAWHEGEDQIDRLDFIRNQMNLLTLDVKK 

KIKEVTEEVANKVSCAMTDEICRLSVLVDEFCSE 

FHPWDVLKIYKSELNKHIEI3GMGRNLADRCTD 

EVNALVLQTQQEIffiNLKPLLPAGlQDKLHTLIPC 

KKFDLSYNLNYHKLCSDFQEDIVFRFSLGWSSLV 

HRFLGPRNAQRVLLGLSEPIFQLPRSLASTPTAPT 

TPATPDNASQEELMITLVTGLASVTSRTSMGIIIV 

GGVIWKTIGWKLLSVSLTMYGALYLYERLSWTT 

HAKERAFKQQFVNYATEKLRMIVSSTSANCSHQ 

VKQQIATTFARLCQQVDITQKQLEEEIARLPKEID 

QLEKIQNNSKLLRNKAVQLH«IELENFTKQFLPSS 

NEES 


3873 


A 


2944 


2089 


PVCTALTPGRMTDDKDVLRDVWFGRIPTCFTLY 

QDEITEREAEPYYLLLPRVSYLTLVTDKVKKHFQ 

KVMRQEDISEIWFEYEGTPLKWHYPIGLLFDLLA 

SSSALPWNTTVHFKSFPEKDLLHCPSKDAIEAHF 

MSCMKEADALKHKSQVINEMQKKDHKQLWMG 

LQNDRFDQFWAINRKLMEYPAEENGFRYIPFRIY 

QTTTERPFIQKLFRPVAADGQLHTLGDLLKEVCP 

SAIDPEDGEKKNQVMfflGIEPMLETPLQWLSEHL 

SYPDNFLfflSnPQPTT) 


3874 


A 


776 


366 


QARGAPSSPMCPLPLAAAAVAAPRAPLRLLNRG 

LAAAMSTAQSLKSVDYEVFGRVQGVCFRMYTE 

DEARKIGWGWVKNTSKGTVTGQVQGPEDKVN 

SMKSWLSKVGSPSSRmRTNFSNEKTISKLEYSNF 

SIRY 


3875 


A 


1081 


182 


slsscqtdprpmsapldaalhalqeeqarlkmr 

lwdlqqlrkelgdspkdkVpfsvpboplvfrght 

qqdpevpkslvsnlrihcpllagsalitfddpkva 

eqvlqqkehtinmeecrlrvqvqplelpmvttiq 

vmvssqlsgrrvlvtgfpaslrlseeelldkleif 

fgktrngggdvdvrellpgsvmlgfardgvaq 

rlcqigqftvplggqqvplrvspyvngeiqkaei 

rsqpvprsvlvlnipdildgpelhdvleihfqkpt 

RGGGEVEALTVVPQGQQGLAVFTSESG 


3876 


A 


26 


431 


RMMKCPQALLAIFWLLLSWVSSEDKWQSPLSL 
WHEGDTVTLNCSYEVTNFRSLLWYKQEKKAPT 
FLFMLTSSGIEKKSGRLSSILDKKELSSILNITATQ 
TGDSAIYLCAVEAQCSLVTCSLYSNSTAEALQL 


3877 


A 


3 


1291 


KAFRLLAERGAAAAMLWSGCRRFGARLGCLPG 

GLRVLVQTGHRSLTSCIDPSMGLNEEQKEFQKV 

AFDFAAREMAPNMAEWDQKELFPVDVMRKAA 

QLGFGGVYIQTDVGGSGLSRLDTSVIFEALATGC 

TSTTAYISIHNMCAWMIDSFGNEEQRHKFCPPLC 

TMEKFASYCLTEPGSGSDAASLLTSAKKQGDHYI 

LNGSKAFISGAGESDIYVVMCRTGGPGPKGISCIV 

VEKGTPGLSFGKKEKKVGWNSQPTRAVIFEDCA 

VPVANRIGSEGQGFLIAVRGLNGGRINIASCSLGA 

AHASVILTRDHLNVRKQFGEPLASNQYLQFTLA 

DMATRLVAARLMVRNAAVALQEERKDAVALCS 

MAKLFATDECFAICNQALQMHGGYGYLKDYAV 

QQYVRDSRVHQILEGSNEVMRILISRSLLQE 


3878 


A 


10 


1014 


LPGSTISSSGCQAPGRADSSGGARNSRRGDSRPG 

SCNRQAVAPPCPSPGPQSRHWIHRGTAPQAGETR 

TLGRGSSAPNACSASVTPCCPSSPPS*SCL*PTRRS 

PQNSSSTEVYRGFWQHGLPST**PFSS*QWPGQH 

TQGCSKLLGKQTTHLPCSTWPA**PSPSCLTRFR* 

W*PSLMCLWASSCSVCV*SPSGSCRH*LWGTHST 

SRTC*ARRSSALPTGLCTDDTSWASSSKARPCAL 

QRPSSLSSLSPCLTC*W*LSSSSPMSARSPAGAET 

GSWATGSPRLTQWKSSRLTSTSHSARSAWKPSA 

TESTPSWPRFSSWTSGEDPASPAPAI 


3879 


A 


200 


699 


LLLTGYIQTLQNQQLSGNQQEMQAVDNLTSAPG 

NTSLCTRDYKITQVLFPLLYTVLFFVGLITNGLA 

MRIFFQIRSKSNFIIFLKNTVISDLLMILTFPFKILS 

DAKLGTGPLRTFVCQVTSVIFYFTMYISISFLGLIT 

IDRYQKTTRPFKTSNPKNLLGAKILK 


3880 


A 


26 


169 


QPETDTMVHLTPEEKSAVTALWGKVNVDEDAG 
DDLCQILVDRPRLRI 


3881 


A 


37 


1100 


TPLFDFWPGFVLSWLQPLSASLRARRAASGPPAC 

RIMPTTVDDVLEHGGEFHFFQKQMFFLLALLSAT 

FAPIYVGIVFLGFTPDHRCRSPGVAELSLRCGWSP 

AEELNYTVPGPGPAGEASPRQCRRYEVDWNQST 

FDCVDPLASLDTNRSRLPLGPCRDGWVYETPGSS 

IVTEFNLVCANSWMLDLFQSSVNVGFFIGSMSIG 

YUDRFGRKLCLLTTVLINAAAGVLMAISPTYTW 

MLIFRLIQGLVSKAGWLIGYELITEFVGRRYRRTV 

GIFYQVAYTVGLLVLAGVAYALPHWRWLQFTV 

ALPNFFFLLYYWCIPESPRWLISQNKNAEAMRDK 

HIAKKNGKSLPASL 


3882 


A 


573 


1620 


KSKCRFPEGLSEGFGPMRKEALSSGSVQEAEAM 

LDEPQEQAEGSLTVYVISEHSSLLPQDMMSYIGP 

KRTAVVRGIMHREAFNnGRRIVQVAQAMSLTED 

VLAAALADHLPEDKWSAEKRRPLKSSLGYEITFS 

LLNPDPKSHDVYWDIEGAVRRYVQPFLNALGAA 

GNFSVDSQILYYAMLGVNPRFDSASSSYYLDMH 

SLPHVINPVESRLGSSAASLYPVLNFLLYVPELAH 

SPLYIQDKDGAPVATNAFHSPRWGGIMVYNVDS 

KTYNASVLPVRVEVDMVRVMEVFLAQLRLLFGI 

AQPQLPPKCLLSGPTSEGlJydTWELDRLLWARSV 
ENLATATTTLTSLA 


3883 


A 


2369 


844 


RIHREEDFQFILKGIARLLSNPLLQTYLPNSTKKIQ 

FHQELLVLFWKLCDFNKVGQPRGALQGDGEQLP 

Q*PGGRDSVRLRGVGQSCPSLELSPLGPSPHP*KF 

LFFVLKSSDVLDILVPILFFLNDARADQSRVGLM 

HIGVFBLLLLSGECNFGVRLNKPYSIRVPMDIPVF 

TGTHADLLIVWFHKIITSGHQRLQPLFDCLLTIW 

NVSPYLKSLSMVTANKLLHLLEAFSTTWFLFSAA 

QNHHLVFFLLEVFNNIIQYQFDGNSNLVYAIIRKR 

SIFHQLANLPTDPPTIHKALQRRRRTPEPLSRTGS 

QGGAPPWRAPAPLPLQSQAPSRPVWWLLQALTS 

*PRSPRCQRMAPCGPWNLSPSRAWRMAARLRGS 

PARHGGSSGDRP/HSSASGQWSPTPEWVLSWKS 

KLPLQTIMRLLQVLVPQVEKICIDKGLTDESEILR 

FLQHGTLVGLLPVPHPILIRKYQANSGTAMWFRT 

YMWGVIYLRNVDPPVWYDTDVKLFEIQRV 


3884 


A 


1 


804 


NGPRAPFSQEGQSTGPPPLIPRLGQHGAQGRIPPL 

NPGQGPGPNKDDSRGPPNHHMGPMSERRHEQSG 

GPEHGPERGPLRGGQDCRGPPDRRGPHPDFPDDF 

SRPDDFHPDKRFGHRLREFEGRGGPLPQEEKWR 

RGGPGPPFPPDHREFSEGDGRGAARGPPGAWEG 

RRPGG*TFPPGSRGPTFS/SGAEEESFRRGAPPRHE 

GRAPPRGRDGFPGPEDFGPEENFDASEEAARGRD 

LRGRGRGTPRGERVTKDTWSGRIGCRIHWL 


3885 


A 


3 


996 


GRRRAGPAHSARMYNMMETELKPPGPQQTSGG 

GGGNSTAAAAGGNQKNSPDRVKRPMNAFMVW 

SRGQRRKMAQENPKMHNSEISKRLGAEWKLLSE 

TEKRPFIDEAKRLRALHMKEHPDYKYRPRRKTK 

TLMKKDKYTLPGGLLAPGGNSMASGVGVGAGL 

GAGVNQRMDSYAHMNGWSNGSYSMMQDQLG 

YPQHPGLNAHGAAQMQPMHRYDVSALQYNSM 

TSSQTYMNG/SRPTYSMSYSQQGTPGMAPGS\MG 

SVVKSEASSSPPVVTSSSHSRAPCQAGDLRDMIS 

MYLPGAEVPEPAAPSRLHMSQHYQSGPVPGTAI 

NGTLPLSHM 


3886 


A 


773 


317 


QCTQKAAEGYTQFYYVDVLDGKLACVNKCTKG 
TKSQMNCNLGTCQLQRSGPRCLCPNTNTHWYW 
GETCEFNIAKSLVYGIVGAVMAVLLLALIILIILFS 
LSQ\RKRHRPESEGEADFGLENATNNFG\PTLETV 
DSGTELHIQ\RPEMVASTV 


3887 


A 


3 


466 


VDFRVKTLLVDNKCFVLQLWDTAGQERYHSMT 

RQLLRKADGVVLMYDITSQESFAHVRYWLDCL 

QDAGSDGVVILLLGNKMDCEEERQVSVEAGQQL 

AQELGVYFGECSAALGHNILEPWNLARSLRMQ 

EEGLKDSLVKVAPKRPPKRFGCCS 


3888 


A 


3412 


3144 


QNIDITNFSSSWNDGLAFCALLHTYLPAHIPYQEL 

NSQDKRRNFMLAFQAAESVGKSTLDINEMVRT 

ERPDWQNVMLYVTAIYKYFET 


3889 


A 


1 


1160 


LVVTAITAILAFPNEYTRMSTSELISELFNDCGLL 

DSSKLCDYENRFNTSKGGELPDRPAGVGVYSAM 

WQLALTLILKIVITIFTFGMKIPSGLFIPSMAVGAI 

AGRLLGVGMEQLAYYHQEWTVFNSWCSQGAD 

CITPGLYAMVGAAACLGGVTRMTVSLWIMFEL 

TGGLEYIVPLMAAAMTSKWVADALGREGIYDA 

HIRLNGYPFLEAKEEFAHKTLAMDVMKPRRNDP 

LLTVLTQDSMTVEDVETIISETTYSGFPVVVSRES 

QRLVGFVLRRDLIISIENARKKQDGWSTSIIYFTE 

HSPPLPPYTPPTLKLRNILDLSPFIVTDLTPMEIVV 

DIFRKLGLRQCLVTHNGRLLGnTKKDVLKHlAQ 

MANQDPDSILFN 


3890 


A 


1 


387 


SWCWTGIFVLGTimRLEGSWYRSLWGPGFNTT 
TATLGFGAPQAPVGDVALNQPDMCVYRRGRKK 
RWYTKLQLKELENEYAINKFINKDKRRRISAAT 
NLSERQVTIWFQNRRVKDKKIVSKLKDTVS 


3891 


A 


2 


2914 


RGGGGDHKMADLSLLQEDLQEDADGFGVDDYS 

SESDVIIIPSALDLAST/QDEMVERPLGRL\DK\YA 

ASENHI*PDKMVAPEFASIPLRE\VCDDERDCIAV 

LGKN*PDWADDSEPTVVRAAELEQVPHIALFLFK 

KTRLSITICFFSKFLLPYCGLDTLADQNXNQVRKT 

SQAALL\ALLEQELIERFDVETKVCPVLIELTAPDS 

NDDVKTEAVAIMCKMAP\MVGKDITERLILPRFC 

EMCCDCRA4FH\VRK\VCAANFGDICSWGQQAT 

EEMLLPRFFQLCSDNVWGVRKACAECFMAVSC 

ATCQEIRRTKLSALFINLISDPSRWVRQAAFQSLG 

PFISTFANPSSSGQYFKEESKSSEEMSVENNKRTR 

DQEAPEDVQVRPEDTPSDLSVSNSSVILENTMED 

HAAEASGKPLGEISVPLDSSLLCTLSSESHQEAAS 

NENDKKPGNYKSMLRPEVGTTSQDSALLDQELY 

NSFHFWRTPLPEIDLDIELEQNSGGKPSPEGPEEE 

SEGPVPSSPNITMATRKELEEMIENLEPHIDDPDV 

KAQVEVLSAALRASSLDAHEETISIEKRSDLQDE 

LDINELPNCKINQEDSVPLISDAVENMDSTLHYIH 

NDSDLSNNSSFSPDEERRTKVQDVVPQALLDQY 

LSMTDPSRAQTVDTEIAKHCAYSLPGVALTLGR 

QNWHCLRETYETLASDMQWKVRRTLAFSIHELA 

VILGD\QLTAADLVPIFNGFLK*PSMKSRIGVLKH 

LHDFLKLLHIDKRREYLYQLQEFLVTDNSRNWR 

FRAELAEQLELLLELYSPRDVYDYLRPIALNLCAD 

KVSSVRWISYKLVSEMVKKLHAATPPTFGVDLIN 

ELVENFGRCPKWSGRQAFVFVCQTVIEDDCLPM 

DQFAVHLMPHLLTLANDRVPNVRVLLAKTLRQT 

LLEKDYFLASASCHQEAVEQTIMALQMDRDSDV 

KYFASIHPASTKISEDAMSTASSTY 


3892 


A 


158 


2191 


VPLPAPSGLSGGGSRGAGCKXAPPGRAPAPGLAP 

LRPSEPTMAVPPGHGPFSGFPGPQEHTQVLPDVR 

LLPRRLPLAFRDATSAPLRKLSVDLIKTYKHINEV 

YYAKKKRRAQQAPPQDSSNKKEKKVLNHGYDD 

DNHDYIVRSGERWLERYEIDSLIGKGSFGQWKA 

YDHQTQELVAIKIIKNKKAFLNQAQIELRLLELM 

NQHDTEMKYYIVHLKRHFMFR>AHLCLVFELLS 

YNLYDLLRNTHFRGVSLNLTRKLAQQLCTALLF 

LATPELSIIHCDLKPENILLCNPKRSAIKIVDFGSS 

CQLGQRIYQYIQSRFYRSPEVLLGTPYDLAIDMW 

SLGCILVElVfflTGEPLFSGS>ffiVCPQEGVDQMNRI 

VEVLGIPPAAMLDQAPKARKYFERLPGGGWTLR 

RTKELRKDYQGPGTRRLQEVLGVQTGGPGGRRA 

GEPGHSPAD\Y\LRFQDLVLRMLEYEPAARISPLG 

ALQHGFFRRTADEATNTGPAGSSASTSPAPLDTC 

PSSSTASSISSSGGSSGSSSDNRTYRYSNRYCGGP 

GPPITDCEMNSPQVPPSQPLRPWAGGDVPHKTH 

QAPASASSLPGTGAQLPPQPRYLGRPPSPTSPPPP 

ELMDVSLVGGPADCSPPHPAPAPQHPAASALRT 

RMTGGRPPLPPPDDPATLGPHLGLRGVPQSTAAS 

S 


3893 


A 


68 


258 


PEEYYPFSPTLQQLFFFLLDSDMGSRPESMGCRK 
NTVPRPASPTEAGIDPQTFLHTWVSECRD 


3894 


A 


1120 


136 


SLPLAPAPAVAGPVALCPAGLCPAQPGMPAGPA 

AASGSHPEVGSVLQRSSQPHWPNPWPGAGHLPP 

PAGPFPYNPPAGPGAAAGLA*SPPRSSPTPCSVGP 

QSCPANASAPPAQPCLAGAPPAASLPPPGPGSVS 

AAPAPGGPAPAEPPLGVPPVPAWLLPDSPPLPGT 

HSGPPPAAVSLPPAAAACPVVVPPPLPHHPPDLES 

PSAAAPNPGCAGGIRHFPPGSPEASSPLRPAAAPA 

LLPLPRPPS*PAa>WKPLHSPVAVAGGSFVAGGSV 

LPAPDLDQPRPSGPPAASPTPGPGVAQPPPGSAVL 

PTVP*APPVSGAAPGRKREW 


3895 


A 


2 


1347 


FGAVSYRPGNGSCWVKVTASSDLSDLISCLCPPR 

SLCSSQACVLPVPGPSLLLPQGLHVGCASAGTRW 

PLSCSIDFQRLLAHEEETQKRRAKESGMAFTQLT 

FRDVAIEFSQDEWKCLNSTQRTLYRDVMLENYR 

NLVSLDLSRNCVIKELAPQQEGNP/ARSIPHSDIGT 

T*KT*H*RVLLQGNQEKNTRL*LSVER**KKLQQ 

SDYGPKRKSYL*ERPTR*KRYRKQVY*TSA\*LSF 

LPHPHELQQFQAEGKIYECNHVEKSVNHGSSVSP 

PQIISSTIKTHVSNKYGTDFrCSSLLTQEQKSCIRE 

KPYRYIECDKALNHGSHMTVRQVSHSGEKGYKC 

DLCGKVFSQKSNLARHWRVHTGEKPYKCNECD 

RSFSRNSCLALHRRVHTGEKPYKCYECDKVFSR 

NSCLALHQKTfflGEKPYTCKECGQAFSVRSTLTN 

HQVIHSDK 


3896 


A 


202 


498 


MVQSCSAYGCKNRYDKDKPVSFHKFPLTRPSLC 
KEWEAAVRRKNFKPTKYSSICSEHFTPDCFKREC 
NNKLLKENAVPTIFLCTEPHDKKEDLLEPQEQ 


3897 


A 


2 


382 


SHGLSRAPHLSAAPAPALASRPCFSSAPCSQGGG 
GGGPATMIHFILLFSRQGKLRLQKWYITLPDKER 
KKITREIVQHLSRGHRTSSFVDWKELKLVYKRYA 
SLYFCCAIEXNQDNELLTLENVHR 


3898 


A 


718 


305 


SEQEPLLGDTPGSREWDILETEEHYKSRWRSIRIL 
YLTMFLSSVGFSWMMSIWPYLQKIDPTADTSFL 
GWVIASYSLGQMVASPIFGLWSNYRPRKEPLIVSI 
LISVAANCLYAYLHIPASHNKYYMLVARGLLGIG 


3899 


A 


24 


718 


FRGRPGIPEREGKGNHSFVEVARVIVVDLHSRLG 
GAMAERKGTAKVDFLKKIEKEIQQKWDTERVFE 
VNASNLEKQTSKGKYFVTFPYPYMNGRLHLGHT 
FSLSKCEFAVGYQRLKGKCCLFPFGLHCTGMPIK 
ACADKLKREIELY/GCPPDFPDEEEEEEETSVKTE 
DniKDKAKGKKSKAA/AKAGSSKYQWGIMKSLG 
LSDEEIVKFSEAEHWLDYFNALAIQDLKRMG 


3900 


A 


360 


1 


VPATSSNVSPSSSESSEPDLSSRSSSSDAPSSSPSVP 
SPCSLSLSSPESPLLPTLLSSKSPAGSAGPTCGCPS 
GPGLRATA/PSRLSSSIAAH/SSSAPETSRPAAARE 
RSPPLHDRESHE 


3901 


A 


193 


345 


GEWAVPPAPGGQGVSIPHGPEPGQGSGVHIAPRQ 
GEGSDRTEPLICPKAAP 


3902 


A 


1188 


1389 


NPAARSAAAREGSPALPPPPVS/SSSGLGLLLPLSP 
PGSHAANPALSPRAPHSHYRPRPRCGPRRRPR 


3903 


A 


63 


396 


NNMRNPHLSSNHYLNLARTETVFARMESVKQRI 
LAPGKEGLKNFAGKSLGQIYRVLEKKQDTGETIE 
LTEDGKPL*VPERKAPLCDCTCFGLPRRYIIAIMS 
GLGFCISFG 


3904 


A 


732 


1046 


AMSECPLILYIHKHIDTYSQSYLFNDLFYPVYSGG 
RMVTYEHLREVVFGKSEDEHYPLW*VLFGK*YA 
VAPNALMFIRFM*NCIPVPKLP*VMDLK**LQYK 
SR 


3905 


A 


46 


910 


QPPPPPPPPPSPPPPPFPPARALSHLRLHPDACLFPS 

PFPLPCSTMPGMMEKGPELLGKNRSANGSAKSP 

AGGGGSGASSTNGGLHYSEPESGCSSDDEHDVG 

MRVGAEYQARIPEFDPGATKYTOKDNGGMLVW 

SPYHSIPDAKLDEYIAIAKEKHGYNVEQALGMLF 

WHKHNIEKSLADLPNFTPFPDEWTVEDKVLFEQ 

AFSFHGKSFHORIQQMLPDKTIASLVKYYYSWKK 

TRSRTSLMDRQARKLANRHNQGDSDDDVEETHP 

MDGNDSDYDPKKEAKKEGMS 


3906 


A 


2 


513 


KVCNCCSQELETSFTYVDKNINLEQRNRSSPSAK 
GHNHPGELGWENPNEWSQEAAISLISEEEDDTSS 
EATSSGKSBDYGFISAILFLVTGILLVIISYIVPREV 
TVDPNTVAAREMERLEKESARLGAHLDRCVIAG 
LCLLTLGGVBLSCLLMMSMWKGELYRRNRFAS 


3907 


A 


71 


412 


ILIMSNCLQNFLKITSTRLLCSRLCQQLRSKRKFF 
GTVPISRLHRRVVITGIGLVITLGVGTHLVWDRLI 
GGESGIVSLVGEEYKSIPCSVAAYVPRGSDEGQF 
NEQNFVSKSD 


3908 


A 


77 


746 


LGTLLGWRAPLFSRCLAFHSPFILLNTPKLVKTAE 

LPPDRNYVLGAHPHGIMCTGFLGNFSTESNGFSQ 

LFPGLRPWLAVLAGLFYLPVYRDYIMSFGLCPVS 

RQSLDFILSQPQLGQAVVIMVGGAHEALYSVPGE 

HCLTLQKRKGFVRLALRHGASLVPVYSFGENDIF 

RLKAFATGSWQHWCQLTFKKLMGFSPCIFWGR 

GLFSATSWGLLPFAVPITTV 


3909 


A 


1 


793 


FRAAGRPAAAMGDIPWGLSSWKASPGKVTEAV 

KEAIDAGYRHFDCAYFYHNEREVGAGIRCKJKE 

GAVRREDLLIATKLWCTCHKKSLVETACRKSLK 

ALKLNYLDLYLIHWPMGFKPPHPEWIMSCSELSF 

CLSHPRVQDLPLDESNMVIPSDTDFLDTWEAME 

DLVITGLVKNIGVSNFNHEQLERLLNKPGLRFKP 

LTNQIECHPYLTQKNLISFCQSRDVSVTAYRPLG 

GSCEGVDLIDNPVIKRIAKEHGKSPAQILI 


3910 


A 


202 


705 


FFTMHRKKVDNRIRILIENGVAERQRSLFVVVGD 

RGKDQWILHHMLSKATVKARPSVLWCYKKEL 

GFSSHRKKRMRQLQKKIKNGTLNIKQDDPFELFI 

AATNIRYCYYNETHKILGNTFGMCVLQDFEALTP 

NLLARTVETVEGGGLVVILLRTMNSLKQLYTVT 

M 


3911 


A 


3 


723 


AGRGARAAGEGGGPFKSRPRPLPSSRSLPAVGGG 

RYGADKMAAGGAVAAAPECRLLPYALHKWSSF 

SSTYLPENILVDKPNDQSSRWSSESNYPPQYLILK 

LERPAIVQNITFGKYEKTHVCNLKKFKVFGGMN 

EE>MmLSSGLKNDYNKETFTLKHKIDEQMFPC 

RFIKIVPLLSWGPSFNFSIWYVELSGIDDPDIVQPC 

LNWYSKYREQEAIRLCLKHFRQHNYTEAFESLQ 
KKT 


3912 


A 


2 


461 


FEKKQLRRPSLFLLGCCSFGIMAPSLWKGLEGIG 

LFALAHAAFSAAQHRSYMRLTEKEDESLPIDIVL 

QTLLAFAVTCYGIVHIAGEFKDMDATSELKNKTF 

DTVRNHPSFYVFNHRGSEYFSGPSDTANSSNQDA 

LSSNTSLKLRKLESLRR 


3913 


A 


362 


20 


APGRPEAKVPERSRBSGSRRVRGPLLQLRPGRTS 
RPASGRGRGGAGGSYGKMRKPDSKIVLLGDMN 
VGKTSLLQRYMERRFPDTVSTVGGAFYLKQWRS 
YNISIWDTAGEAGAA 


3914 


A 


1 


7545 


PGIRVGITSQTGLSSNLQENCSKLAFISSHGTEKQ 

LQCMPMEGRGRASSSISDLQGKGFEKGTGEKHV 

PGVGSARHSPQASAGGSPWQRGKAQTRWLGKP 

DPGRKRRRGSPQEEGGLRVSAAARLLCSGANRC 

KVLVRQNSTPNTQQPAVHPSTPPSRPLPQAGRCL 

VAPLRPHPDWVAAKTLAKALRAPGKPWRLAAP 

SPLGDLGAPGLPGPSTAPRTLSVEEPGVECNQLC 

LYADVTDPVLCLGQKDPGVEGKHCEKEKISSSK 

ELKHVHAKSEPSKPARRLSESLHVVDENKNESKI 

EREHKRRTSTPVIMEGVQEETDTRDVKRQVERSE 

ICTEEPQKQKSTLKNEKHLKKDDSETPHLKSLLK 

KEVKSSKEKPEREKTPSEDKLSVKHKYKGDCMH 

KTGDETELHSSEKGLKVEENIQKQSQQTKLSSDD 

KTERKSKHRNERKLSVLGKDGKPVSEYIIKTOEN 

VRKENNKKERRLSAEKTKAEHKSRRSSDSKIQK 

DSLGSKQHGITLQRRSESYSEDKCDMDSTNMDS 

NLKPEEVVHKEKRRTKSLLEEKLVLKSKSKTQG 

KQVKVVETELQEGATKQATTPKPDKEKNTIffiND 

SEKQRKSKVEDKPFEETGVEPVLETASSSAHSTQ 

KDSSHRAKLPLAKEKYKSDKDSTSTRLERKLSD 

GHKSRSLKHSSKX)IKKKDENKSDDKDGKEVDSS 

HEKARGNSSLMEKKLSRRLCENRRGSLSQEMAK 

GEEKLAANTLSTPSGSSLQRPKKSGDMTLIPEQEP 

MEIDSEPGVENVFEVSKTQDNRNNNSHQDIDSEN 

MKQKTSATVQKDELRTCTADSKATAPAYKPGR 

GTGVNSNSEKHADHRSTLTKKMfflQSAVSKMNP 

GEKEPIHRGTTEVNDDSETVHRMLLSAPSENDRV 

QKNLKNTAAEEHVAQGDATLEHSTNLDSSPSLSS 

VTWPLRESYDPDVIPLFDKRTVLEGSTASTSPAD 

HSALPNQSLTVRESEVLKTSDSKEGGEGFTVDTP 

AKASITSKRHBPEAHQATLLDGKQGKVIMPLGSK 

LTGVIVENENITKEGGLVDMAKKENDLNAEPNL 

KQTIKATVENGKKDGIAVDHVVGLNTEKYAETV 

KLKHKRSPGKVKDISIDVERRNENSEVDTSAGSG 

SAPSVLHQRNGQTEDVATGPRRAEKTSVATSTE 

GKDKDVTLSPVKAGPATTTSSETRQSEVALPCTS 

lEADEGLIIGTHSRNNPLHVGAEASECTVFAAAEE 

GGAVVTEGFAESETFLTSTKEGESGECAVAESED 

RAADLLAVHAVKDEANVNSVVTEEKDDAVTSAG 

SEEKCDGSLSRDSEIVEGTITFISEVESDGAVTSAG 

TEIRAGSISSEEVDGSQGNMMRMGPKKETEGTV 

TCTGAEGRSDNFVICSVTGAGPREERMVTGAGV 

VLGDM)APPGTSASQEGDGSVNDGTEGESAVTS 

TGITEDGEGPASCTGSHJSSEGFAISSESKNGESA 

MDSTVAKEGTNVPLVAAGPCDDEGIVTSTGAKE 

EDEEGEDWTSTGRGNEIGHASTCTGLGEESEGV 

LICESAEGDSQIGTWEHVEAEAGAAIMNANENN 

VDSMSGTEKGSKDTDICSSAKGIVESSVTSAVSG 

KDEVTPVPGGCEGPMTSAASDQSDSQLEKVEDT 

TISTGLVGGSYDVLVSGEVPECEVAHTSPSEKED 

EDIITSVENEECDGLMATTASGDITNQNSLAGGK 

NQGKVLIISTSTTNDYTPQVSAITDVEGGLSDALR 

TEENMEGTRVTTEEFEAPMPSAVSGDDSQLTASR 

SEEKDECAMISTSIGEEFELPISSATTIKCAESLQP 

VAAAVEERATGPVLISTADFEGPMPSAPPEAESP 

LASTSKEEKDECALISTSIAEECEASVSGWVESE 

NERAGTVMEEKDGSGIISTSSVEDCEGPVSSAVP 

QEEGDPSVTPAEEMGDTAMISTSTSEGCEAVMIG 

AVLQDEDRLTITRVEDLSDAAIISTSTAECMPISA 

SmRHEENQLTADNPEGNGDLSATEVSKHKVPM 

PSLIAENNCRCPGPVRGGKEPGPVLAVSTEEGHN 

GPSVHKPSAGQGHPSAVCAEKEEKHGKECPEIGP 

FAGRGQKESTLHLINAEEKNVLLNSLQKEDKSPE 

TGTAGGSSTASYSAGRGLEGNANSPAHLRGPEQ 

TSGQTAKDSSVSSIRYLAAVNTGAIKADDMPPVQ

GTVAEHSFLPAEQQGSEDNLKTSTTKCITGQESKI 

APSHTMIPPATYSVALLAPKCEQDLTIKNDYSGK 

WTDQASAEKTGDDNSTRKSFPEEGDIMVTVSSE 

ENVCDIGNEESPLNVLGGLKLKANLKMEAYVPS 

EEEKNGEILAPPESLCGGKPSGIAELQREPLLVNE 

SLNVENSGFRTNEEIHSESYNKGEISSGRKDNAE 

AISGHSVEADPKEVEEEERHMPKRKRKQHYLSSE 

DEPDDNPDVLDSRIETAQRQCPETEPHATKEENS 

RDLEELPKTSSETNSTTSRVMEEKDEYSSSETTGE

KPEQNDDDTIKSQE 


3915 


A 


1 


7545 


PGIRVGITSQTGLSSNLQENCSKLAFISSHGTEKQ 

LQCMPMEGRGRASSSISDLQGKGFEKGTGEKHV 

PGVGSARHSPQASAGGSPWQRGKAQTRWLGKP 

DPGRKRRRGSPQEEGGLRVSAAARLLCSGANRC 

KVLVRQNSTPNTQQPAVHPSTPPSRPLPQAGRCL 

VAPLRPHPDWVAAKTLAKALRAPGKPWRLAAP 

SPLGDLGAPGLPGPSTAPRTLSVEEPGVECNQLC 

LYADVTDPVLCLGQKDPGVEGKHCEKEKISSSK 

ELKHVHAKSEPSKPARRLSESLHVVDENKNESKI 

EREHKRRTSTPVIMEGVQEETDTRDVKRQVERSE 

ICTEEPQKQKSTLKNEKHLKKDDSETPHLKSLLK 

KEVKSSKEKPEREKTPSEDKLSVKHKYKGDCMH 

KTGDETELHSSEKGLKVEENIQKQSQQTKLSSDD 

KTERKSKHRNERKLSVLGKDGKPVSEYIIKTDEN 

VRJCENNKKERRLSAEKTKAEHKSRRSSDSKIQK 

DSLGSKQHGITLQRRSESYSEDKCDMDSTNMDS 

NLKPEEVVHKEKRRTKSLLEEKLVLKSKSKTQG 

KQVKVVETELQEGATKQATTPKPDKEKNTEEND 

SEKQRKSKVEDKPFEETGVEPVLETASSSAHSTQ 

KDSSHRAKLPLAKEKYKSDKDSTSTRLERKLSD 

GHKSRSLKHSSmKKKDENKSDDKDGKEVDSS 

HEKARGNSSLMEKKLSRRLCENRRGSLSQEMAK 

GEEKLAANTLSTPSGSSLQRPKKSGDMTLIPEQEP 

MEIDSEPGVENVFEVSKTQDNRNNNSHQDIDSEN 

MKQKTSATVQKDELRTCTADSKATAPAYKPGR 

GTGWSNSEKHADHRSTLTKKMHIQSAVSKMNP 

GEKEPIHRGTTEVNroSETVHRMLLSAPSE>ro 

QKNLKNTAAEEHVAQGDATLEHSTNLDSSPSLSS 

VTVVPLRESYDPDVIPLFDKRTVLEGSTASTSPAD 

HSALPNQSLTVRESEVLKTSDSKEGGEGFTVDTP 

AKASITSKRHIPEAHQATLLDGKQGKVIMPLGSK 

LTGVIVENENITKEGGLVDMAKKENDLNAEPNL 

KQTIKATVENGKKDGIAVDHVVGLNTEKYAETV 

KIJCHKRSPGKVICDISIDVERRNENSEVDTSAGSG 

SAPSVLHQKNGQTEDVATGPRRAEKTSVATSTE 

GKDKDVTLSPVKAGPATTTSSETRQSEVALPCTS 

lEADEGLUGTHSRNNPLHVGAEASECTVFAAAEE 

GGAWTEGFAESETFLTSTKEGESGECAVAESED 

RAADLLAVHAVKIEANVNSVVTEEKDDAVTSAG 

SEEKCDGSLSRDSEIVEGTITFISEVESDGAVTSAG 

TEIRAGSISSEEVDGSQGNMMRMGPKKETEGTV 

TCTGAEGRSDNFVICSVTGAGPREERMVTGAGV 

VLGDNDAPPGTSASQEGDGSVNDGTEGESAVTS 

TGITEDGEGPASCTGSEDSSEGFAISSESEENGESA 

MDSTVAKEGTNVPLVAAGPCDDEGIVTSTGAKE 

EDEEGEDVVTSTGRGNEIGHASTCTGLGEESEGV 

LICESAEGDSQIGTVVEHVEAEAGAAIMNANENN 

VDSMSGTEKGSKDTDICSSAKGIVESSVTSAVSG 

KDEVTPVPGGCEGPMTSAASDQSDSQLEKVEDT 

TISTGLVGGSYDVLVSGEVPECEVAHTSPSEKED 

EDIITSVENEECDGLMATTASGDITNQNSLAGGK 

NQGKVLnSTSTTNDYTPQVSAITDVEGGLSDALR 

TEENMEGTRVTTEEFEAPMPSAVSGDDSQLTASR 

SEEKDECAMISTSIGEEFELPISSATTIKCAESLQP 

VAAAVEERATGPVLISTADFEGPMPSAPPEAESP 

LASTSKEEKDECALISTSIAEECEASVSGWVESE 

NERAGTVMEEKDGSGIISTSSVEDCEGPVSSAVP 

QEEGDPSVTPAEEMGDTAMISTSTSEGCEAVMIG 

AVLQDEDRLTITRVEDLSDAAIISTSTAECMPISA 

SIDRHEENQLTADNPEGNGDLSATEVSKHKVPM 

PSLIAENNCRCPGPVRGGKEPGPVLAVSTEEGHN 

GPSVHKPSAGQGHPSAVCAEKEEKHGKECPEIGP 

FAGRGQKESTLHLINAEEKNVLLNSLQKEDKSPE 

TGTAGGSSTASYSAGRGLEGNANSPAHLRGPEQ 

TSGQTAKDSSVSSIRYLAAVNTGAIKADDMPPVQ 

GTVAEHSFLPAEQQGSEDNLKTSTTKCITGQESKI 

APSHTMIPPATYSVALLAPKCEQDLTIKNDYSGK 

WTDQASAEKTGDDNSTRKSFPEEGDIMVTVSSE 

ENVCDIGNEESPLNVLGGLKLKANLKMEAYVPS 

EEEKNGEILAPPESLCGGKPSGIAELQREPLLVNE 

SLNVENSGFRTNEEIHSESYNKGEISSGRKDNAE 

AISGHSVEADPKEVEEEERHMPKRKRKQHYLSSE 

DEPDDNPDVLDSRIETAQRQCPETEPHATKEENS 

RDLEELPKTSSETNSTTSRVMEEKDEYSSSETTGE 

KPEQNDDDTIKSQE 


3916 


A 


2 


773 


GPFGVLWPSAKPGPVTAVEARPPDASDPEGLRG 
GSPAPLLAPGPLDPSGRLHPAVSMMSYLKQPPYG 
MNGLGLAGPAMDLLHPSVGYPATPRKQRRERTT 
FTRSQLDVLEALFAKTRYPDIFMREEVALKINLPE 
SRVQVWFKMlIt/yCCRQQQQSGSGTKSRPAKKK 
SSPVRESSGSESSGQFTPPAVSSSASSSSSASSSSA 
NPAAAAAAGLWAKLPCPLHIFSLCVFIEENRLV 
SGSWARDIRSVEETDKSGYR 


3917 


A 


2 


776 


RNIPGREFRPPGLRRLLKGPHMPREPRGYRTRVP 

ALRELVPSSHAGSGASEHCQNNRQGSRQHRASR 

NVQAGGALAPPRHLCGLCSRLHELKPDLSVRAA 

PSRAGASVMALRKELLKSIWYAFTALDVEKSGK 

VSKSQLRVLSHNLYTVLHffHDPVALEEHFRDDD 

DGPVSSQGYMPYLNKYILDKVEEGAFVKEHFDE 

LCWTLTAKKNYRADSNGNSMLSNQDAFRLWCL 

FNFLSEDKYPLIMDPDEGEYLLKRYS 


3918 


A 


10 


318 


WQDLVCLGGSRAQEQKPLQQLWNAILLVAMLL 
CTGLWQAQRQASRQSQRELGGQVDLFKRRVV 
RRLASLKTRRCRLSRAAQGLPDPGAETCAVCLD 
YFCNKQ 


3919 


A 


1 


204 


RVLTAmHTLKENLRKFYKGKKDKPLDLRPKKT 
RAMRRRLNMHEENLKTKKQHRKERLYPLRKYA 
AKA 


3920 


A 


1 


654 


RCCRSFVAPLQEKVVFGLFFLGAILCLSFSWLFHT 

VYCHSEGVSRLFSKLDYSGIALLIMGSFVPWLYY 

SFYCNPQPCFIYLIVICVLGIAAIIVSQWDMFATPQ 

YRGVRAGVFLGLGLSGIIPTLHYVISEGFLKAATI 

GQIGWLMLMASLYITGAALYAARIPERFFPGKCD 

IWFHSHQLFHIFWAGAFVHFHGVSNLQEFRFMI 

GGGCSEEDAL 


3921 


A 


1587 


452 


LERDGCGGEEGGSVRSGAGPDSDPRGASSPPAG 

HRGTAASPRPVAAPSRTPAPPHTRARASPGLPSG 

PAWRRVQWFSRVSGQVSTLMKATVLMRQPGRV 

QEIVGALRKGGGDRLQVISDFDMTLSRFAYNGK 

RCPSSYNDLDNSKIISEECRKELTALLHHYYPIEID 

PHRWKEKLPHMVEWWTKAHNLLCQQKIQKFQI 

AQVVRESNAMLREGYKTFFNTLYHNNIPLFIFSA 

GIGDILEEIIRQMKVFHPNIHIVSNYMDFNEDGFL 

QGFKGQLIHTYNKNSSACENCGYFQQLEGKTNV 

ILLGDSIGDLTMADGVPGVQNILKIGFLNDKVEE 

RRERYMDSYDP/LEKDETLDVVNGLLQHILCQG 

VQLEMQGP 


3922 


A 


2 


164 


GKIYQRAFGGHSLKFGKGVQAHGCCCVADRTG 
HSBLHTSYGRERPAPVHLRQDT 


3923 


A 


2 


3258 


EHATHAYAKLGTRRRHREVTVFVPTWQLKKNR 

RVRESHFLTKLHSLKMLSITPSQLENGmXTYD 

YRFMVKLAEETDGIIVTNEQIHILMNSSKKLMVK 

DRLLPFTFAGNLFMVPDDPLGRDGPTLDEFLKKP 

NRLDTDIGNFLKVWKTLPPSSASVTELSDDADSG 

PLESLPNMEEVREEKEERQDEEQRQGQGTQKAA 

EEDDLDSSLASVFRVECPSLSEEILRCLSLHDPPD 

GALDIDLLPGAASPYLGPWDGKAPCQQVLAHL 

AQLTIPSNFTALSFFMGFMDSHRDAIPDYEALVG 

PLHSLLKQKPDWQWDQEHEEAFLALKRALVSAL 

CLMAPNSQLPFRLEVTVSHVALTAILHQEHSGRK 

HPIAYTSKPLLPDEESQGPQSGGDSPYAVAWALK 

HFSRCIGDTPVVLDLSYASRTTADPEVREGRRVS 

KAWLIRWSLLVQDKGKRALELALLQGLLGENRL 

LTPAASMPRFFQVLPPFSDLSTFVCIHMSGYCFYR 

EDEWCAGFGLYVLSPTSPPVSLSFSCSPYTPTYA 
HLAAVACGLERFGQSPLPWFLTHCNWIFSLLWE 
LLPLWRARGFLSSDGAPLPHPSLLSYIISLTSGLSS 
LPFIYRTSYRGSLFAVTVDTLAKQGAQGGGQWW 
SLPKDVPAPTVSPHAMGKRPNLLALQLSDSTLAD 
IIARLQAGQKLSGSSPFSSAFNSLSLDKESGLLMF 
KGDKKPRVWVVPTQLRRDLIFSVHDIPLGAHQR 
PEETYKKLRLLGWWPGMQEHVKDYCRSCLFCIP 
RNLIGSELKVIESPWPLRSTAPWSNLQIEVVGPVT 
ISEEGHKHVLIVADPNTRWVEAFPLKPYTHTAVA 
QVLLQHVFARWGVPVRLEAAQGPQFARHVLVS 
CGLALGAQVASLSRDLQFPCLTSSGAYWEFKRA 
LKEFIFLHGKKWAASLPLLHLAFRASSTDATPFK 
VLTGGESRLTEPLWAVEMSSANIEGLKMDVFLLQ
LVGELLELHWRVADKASEKAENRRFKRESQEKE 
WNVGDQVLLLSLPRNGSSAKWVGPFYIGDRLSL 
SLYRIWGFPTPEKLGCIYPSSLMKAFAKSGTPLSF 
KVLEQ 


3924 


A 


1 


1826 


MGSVTVRYFCYGCLFTSATWTVLLFVYFNFSEV

TQPLKNWVKGSGPHGPSPKKFYPRFTRGPSRVL 

EPQFKANKIDDVIDSRVEDPEEGHLKFSSELGMIF 

NERDQELRDLGYQKHAFNMLISDRLGYHRDVPD 

TRNAACKEKFYPPDLPAASWICFYNEAFSALLR 

TVHSVIDRTPAHLLHEIILVDDDSDFDDLKGELDE 

YVQKYLPGKIKVIRNTKREGLIRGRMIGAAHATG 

EVLVFLDSHCEVNVMWLQPLLAAIREDRHTVGC 

PVIDIISADTLAYSSSPVVRGGFNWGLHFKWDLV 

PLSELGRAEGATAPIKSPTMAGGLFAMNRQYFH 

ELGQYDSGMDIWGGENLEISFRIWMCGGKLFIIP 

CSRVGfflFRKRRPYGSPEGQDTMTHNSLRLAHV 

WLDEYKEQYFSLRPDLKTKSYGNISERVELRKKL 

GCKSFKWYLDNVYPEMQISGSHAKPQQPIFVNR 

GPKRPKVLQRGRLYHLQTNKCLVAQGRPSQKG 

GLVVLKACDYSDPNQIWIYNEEHELVLNSLLCLD 

MSETRSSDPPRLMKCHGSGGSQQWTFGKNNRLY 

QVSVGQCLRAVDPLGQKGSVAMAICDGSSSQQ 

WHLEG 


3925 


A 


5386 


2897 


VRWNSKTECYLSIQTQENFPANLNELVNCIVISSL 

VTTQRKLKAMSLLGSRNQLARAVLNPNPMDFCT 

KDLLTTTSERIIAYLRDFNEDQKKAIETAYAMVK 

HSPSVAKICLIHGPPGTGKSKTIVGLLYRLLTENQ 

RKGHSDENSNAKIKQNRVLVCAPSNAAVDELM 

KKIILEFKEKCKJDKKNPLGNCGDINLVRLGPEKSI 

NSE\a,KFSLDSQVNHRMKKELPSHVQAMHKRK 

EFLDYQLDELSRQRALCRGGREIQRQELDENISK 

VSKERQELASKIKEVQGRPQKTQSIIILESHIICCT 

LSTSGGLLLESAFRGQGGVPFSCVIVDEAGQSCEI 

ETLTPLIHRCNKLILVGDPKQLPPTVISMKAQEYG 

YDQSMMARFCRLLEENVEHNMISRLPILQLTVQ 

YRMHPDICLFPSNYVYNRNLKTNRQTEAIRCSSD 

WPFQPYLVFDVGDGSERRDNDSYINVQEIKLVM 

EinCLIKDKRKDVSFRNIGUTHYKAQKTMIQKDL 

DKEFDRKGPAEVDTVDAFQGRQKDCVIVTCVRA 

NSIQGSIGFLASLQRLNVTITRAKYSLFILGHLRTL 

MENQHWNQLIQDAQKRGAIIKTCDKNYRHDAV 

KILKLKPVLQRSLTHPPTIAPEGSEIPQGGLPSSKL 

DSGFAKTSVAASLYHTPSDSKEITLTVTSKDPERP 

PVHDQLQDPRLLKRMGDEVKGGIFLWDPQPSSPQ 

HPGATPPTGEPGFPVVHQDLSHVQQPAAVVAAL 

SSHKPPVRGEPPAASPEASTCQSKCDDPEEELCH 

RREARAFSEGEQEKCGSETHHTORNSRWDKRTL 

EQEDSSSKKRKLL 


3926 


A 


99 


254


MPREDRATWKSNYFLKHQLLDDYPKRFIVGANN 
VGSKQMQQIRMSLRGKAVVLMGKNTMMR 


3927 


A 


542 


2 


AHLLMLNLAL\TDLL\YLTSLPFL1HYYASGENWI 
FGDFMCKFIRFSFHFNLYSSBLFLTCFSIFRYCVIIH 
PMSCFSIHKTRCAVVACAVVWIISLVAVIPMTFLI 
TSTNRTrWSACLDLTSSDELNTIKWYNLILTA\LL 
CLPLVIVTLCYTTIIHTLTHGHAN\DSCLKQKARR 
LTILLL 


3928 


A 


1 


1516 


GEEAVGGGAEGGGFGVGAQGRAGGRGVEAGR 

MRLSKTLVDMDMADYSAALDPAYTTLEFENVQ 

VLTMGNDTSPSEGTNLNAPNSLGVSALCAICGDR 

ATGKHYGASSCDGCKGFFRRSVRKNHMYSCRFS 

RQCVVDKDKRNQCRYCRLKKCFRAGMKKEAV 

QNERDRISTRRSSYEDSSLPSINALLQAEVLSRQIT 

SPVSGINGDIRAKKIASIADVCESMKEQLLVLVE 

WAKYIPGFCELPLDDQGALLRAHAGEHLLLGAT 

KRSMVFKDVLLLGNDYIVPRHCPELAEMSRVSIR 

ILDELVLPFQELQIDDNEYAYLKAIIFFDPDAKGL 

SDPGKIKRLRSQVQVSLEDYINDRQYDSRGRFGE 

LLLLLPTLQSITWQMIEQIQFIKLFGMAKIDNLLQ 

EMLLGGSPSDAPHAHHPLHPHLMQEHMGTNVIV 

ANTMPTHLSNGQMCEWPRPRGQAATPETPQPSP 

PGASGSEPYKLLPGAVATIVKPLSAIPQPTITKQE 

VI 


3929 


A 


1 


2782 


RVLSLESPLEKDPRVLGAQSVPRGRALKGLSPLG 

LDSAFRLFPDPRAGPWNTAVLSSGMEPETALWG 

PDLQGPEQSPNDAHRGAESENEEESPRQESSGEEI 

IMGDPAQSPESKDSTEMSLERSSQDPSVPQNPPTP 

LGHSNPLDHQIPLDPPAPEVVPTPSDWTKACEAS 

WQWGALTTWNSPPWPANEPSLRELVQGRPAG 

AEKPYICNECGKSFSQWSKLLRHQRIHTGERPNT 

CSECGKSFTQSSHLVQHQRTHTGEKPYKCPDCG 

KCFSWSSNLVQHQRTHTGEKPYKCTECEKAFTQ 

STNLJKHQRSHTGEKPYKCGECRRAFYRSSDLIQ 

HQATHTGEKPYKCPECGKRFGQNHNLLKHQKIH 

AGEKPYRCTECGKSFIQSSELTQHQRTHTGEKPY 

ECLECGKSFGHSSTLIKHQRTHLREDPFKCPVCG 

KTFTLSATLLRHQRTHTGERPYKCPECGKSFSVS 

SNLINHQRIHRGERPYICADCGKSFIMSSTLIRHQ 

RIHTGEKPYKCSDCGKSFIRSSHLIQHRRTHTGEK 

PYKCPECGKSFSQSSNLITHVRTHMDENLFVCSD 

CGKAFLEAHELEQHRVIHERGKTPARRAQGDSL 

LGLGDPSLLTPPPGAKPHKCLVCGKGFNDEGIFM 

QHQRIHIGENPYKNADGLIAHAAPKPPQLRSPRL 

PFRGNSYPGAAEGRAEAPGQPLKPPEGQEGFSQR 

RGLLSSKTYICSHCGESFLDRSVLLQHQLTHGNE 

KPFLFPDYRIGLGEGAGPSPFLSGKPFKCPECKQS 

FGLSSELLLHQKVHAGGKSSHKSPELGKSSSVli 

EHLRSPLGARPYRCSDCRASFLDRVALTRHQETH 

TQEKPPNPEDPPPEAVTLSTDQEGEGETPTPTESS 

SHGEGQNPKTLVEEKPYLCPECGAGFTEVAALLL 

T TT> O/^T TT%^\ wot 

HRSCHPGVSL 


3930 


A 


513 


273 


KTQETHIYISEHIFFPFLQGFGNLPICMAKTDLSLS 
HQPDKKGVPSDFILPISDVRASIGAGFIYPLVGTG 

SRESPLWL 


3931 


A 


16 


305 


KRI©FLSCWPAFTVLGEARGDQVDWSKLYRDT 
GL\nKMSRKPRASSPFSNNOTSTPKRRGRGKHPLI 
PGPEALSKFPRQPIREKGPVKEVPuTKGSP 


3932 


A 


16 


305 


KRRDFLSCWPAFTVLGEARGDQVDWSKLYRDT 
GLVKMSRKPRASSPFSNNHPSTPKRRGRGKHPLI 
PGPEALSKFPRQPIREKGPVKEVPGTKGSP 


3933 


A 


1 


1546 


STHASEHWDSALQLAKHLAPDQIPFISKEYAIQLE 

FAGDYVNALAHYEKGITGDNKEHDEACLAGVA

QMSIRMGDIRRGVNQALKHPSRVLKRDCGAILE 

NMKQFSEAAQLYEKGLYYDKAASVYIRSKNWA 

KVGDLLPHVSSPKIHLQYAKAKEADGRYKEAVV 

AYENAKQWQSVIRIYLDHLNNPEKAVNIVRETQ 

SLDGAKMVARFFLQLGDYGSAIQFLVMSKCNNE 

AFTLAQQHNKMEIYADnGSEDTTNEDYQSIALY 

FEGEKRYLQAGKFFLLCGQYSRALKHFLKCPSSE 

DNVAlEMAIETVGQAKDELLTNQLroHLLGEND 

GMPKDAKYLFRLYMALKQYREAAQTAIIIAREE 

QSAGNYRNAHDVLFSMYAELKSQKIKIPSEMAT 

NLMILHSYILVKIHVKNGDHMKGARMLIRVANN 

ISKFPSHIVPILTSTVIECHRAGLKNSAFSFAAML 

MRPEYRSKIDAKYKKKIEGMXaiRPDISEIEEAT^ 

CPFCKFLLPESELL 


3934 


A 


334 


1268 


PTRRPILPLTSPKAISVPSPLQGKQHTLVKSCLSVS 

GIGGFLVSLSSRMKLQTLAVSVTALKFWSAYVP 

CQTQDRDALRLTLEQIDLIRRMCASYSELELVTS 

AKALNDTQKLACLIGVEGGHSLDNSLSILRtFYM 

LGVRYLTLTHTCNTPWAESSAKGVHSFYNNISGL 

TDFGEKVVAEMNRLGMMVDLSHVSDAVARRAL 

EVSQAPVIFSHSAARGVCNSARNVPDDILQLLEE 

ERWAFVMVSLFHGELIQWQPIRPMCSTVADHFD 

HIKAVMGSKFIGIGGDYDGAGKYRKKTTCKAPW 

RTSSRMSS 


3935 


A 


1 


883 


HETTPAWQSVLLERGWNKFDKQEQNAEDWNL 

YmTSSFRMTEHNSVKPWQQLNHHPGTTKLTR 

KDCLAKHLKHMRRMYGTSLYQFIPLTFVMPNDY 

TKFVAEYFQERQMLGTKHSYWICKPAELSRGRG 

ILIFSDFKDFIFDDMYIVQKYISNPLLIGRYKCDLR 

lYVCVTGFKPLTIYVYQEGLVRFATEBLFDLSNLQ 

NNYAHLTNSSINKSGASYEKIKEVIGHGCKWTLS 

RFFSYLRSWDVDDLLLWKKIHRMVILTILAIAPS 

V rr AA IN Hi-rr Or UlLiiL/lJri Cr xlxv 1 \j 


3936 


A 


203 


441 


HLAHSLGPLPKHYQYCVRYLYYQVTKDVIKEFA 
DDGVKYLELRSTPRRENATGMTKKTYVESILBGI 
KQSKQENLDIDV 
SEQ ID NO: | Position of end of Signal in Amino Acid Sequence | MaxS (Maximum Score) | MeanS (Mean Score)
1 | 19 | 0.930 | 0.680
2 | 24 | 0.964 | 0.863
3 | 21 | 0.990 | 0.901
4 | 19 | 0.981 | 0.942
5 | 22 | 0.991 | 0.928
6 | 21 | 0.956 | 0.843
8 | 22 | 0.913 | 0.718
9 | 17 | 0.997 | 0.969
11 | 19 | 0.930 | 0.680
13 | 36 | 0.983 | 0.863
14 | 28 | 0.935 | 0.839
15 | 21 | 0.997 | 0.955
16 | 16 | 0.983 | 0.944
17 | 18 | 0.989 | 0.884
19 | 49 | 0.996 | 0.719
20 | 28 | 0.972 | 0.920
21 | 23 | 0.954 | 0.905
22 | 46 | 0.955 | 0.568
23 | 26 | 0.942 | 0.654
24 | 19 | 0.979 | 0.941
25 | 34 | 0.884 | 0.565
26 | 33 | 0.934 | 0.584
27 | 17 | 0.975 | 0.914
28 | 18 | 0.980 | 0.934
29 | 23 | 0.928 | 0.718
30 | 26 | 0.978 | 0.885
32 | 20 | 0.946 | 0.719
33 | 29 | 0.933 | 0.671
35 | 25 | 0.996 | 0.920
36 | 26 | 0.903 | 0.579
40 | 19 | 0.981 | 0.942
47 | 25 | 0.971 | 0.909
53 | 22 | 0.991 | 0.928
55 | 24 | 0.960 | 0.808
60 | 19 | 0.986 | 0.967
78 | 22 | 0.913 | 0.718
86 | 20 | 0.883 | 0.555
87 | 24 | 0.982 | 0.889
88 | 17 | 0.997 | 0.969
115 | 19 | 0.930 | 0.680
134 | 36 | 0.983 | 0.863
136 | 17 | 0.913 | 0.696
137 | 19 | 0.958 | 0.905
140 | 28 | 0.935 | 0.839
143 | 32 | 0.914 | 0.740
153 | 21 | 0.997 | 0.955
154 | 25 | 0.913 | 0.583
155 | 29 | 0.972 | 0.857
169 | 30 | 0.977 | 0.817
170 | 30 | 0.977 | 0.819
171 | 30 | 0.977 | 0.819
175 | 47 | 0.926 | 0.606
176 | 30 | 0.968 | 0.872
177 | 22 | 0.957 | 0.791
192 | 43 | 0.930 | 0.678
195 | 19 | 0.956 | A 0£A
202 | 21 | 0.982 | A 0*71
203 | 24 | 0.957 | U.67U
207 | 23 | 0.954 |
224 | 46 | 0.955 | U.joo
225 | 26 | 0.942 | U.oj4
228 | 45 | 0.961 |
231 | 28 | 0.994 | 0.9J7
232 | 28 | 0.993 | 0.896
234 | 19 | 0.979 | 0.942
235 | 19 | 0.979 | A CSA 1
238 | 20 | 0.987 | 0.943
244 | 23 | 0.929 | A /COO
250 | 34 | 0.884 | 0.565
256 | 33 | 0.934 | 0.584
258 | 25 | 0.934 | 0.729
259 | 22 | 0.969 | 0.871
264 | 19 | 0.952 | 0.753
265 | 17 | 0.975 | 0.914
266 | 17 | 0.975 | 0.914
271 | 23 | 0.974 | 0.884
274 | 13 | 0.971 | 0.834
275 | 18 | 0.980 | 0.934
278 | 32 | 0.958 | 0.668
280 | 24 | 0.966 | 0.881
281 | 24 | 0.966 | 0.881
286 | 23 | 0.928 | 0.718
291 | 35 | 0.991 | 0.824
293 | 27 | 0.956 | 0.806
294 | 23 | 0.952 | 0.827
301 | 26 | 0.978 | 0.885
316 | 20 | 0.946 | 0.719
320 | 28 | 0.978 | A TT C
327 | 29 | 0.933 | 0.671
331 | 48 | 0.903 | 0.571
345 | 25 | 0.996 | 0.920
349 | 26 | 0.903 | 0.579
351 | 24 | 0.951 | 0.876
352 | 18 | 0.944 | 0.716
353 | 32 | 0.992 | 0.854
354 | 27 | 0.945 | A 0 1 T
355 | 16 | 0.922 | 0.716
356 | 13 | 0.959 | 0,0 18
357 | 23 | 0.986 | 0.875
358 | 19 | 0.904 | 0.671
359 | 16 | 0.988 | 0.951
360 | 15 | 0.981 | 0.938
361 | 18 | 0.944 | O.7IO
362 | 21 | 0.984 | A OXQ
363 | 40 | 0.979 | 0.813
364 | 18 | 0.883 | 0.693
365 | 22 | 0.962 | 0.908
366 | 22 | 0.961 | 0.827
367 | 44 | 0.941 | U.0Z4
368 | 20 | 0.952 | 0.791
369 | 22 | 0.949 | 0.840
370 | 28 | 0.957 | 0.682

3/2 


2o 


u.y /4 


A fiOA 


3 /3 


19 


A OTO 


A 0/l'7 

u.y4/ 


3 /4 


29 


n o^Q 


A 


3 / J 


19 


A A/IA 


A CQ7 


3 // 


23 


n o/jo 

U,90z 


A Q1 A 


3 /o 


31 


u.y /4 


A fiO^ 


379 


2o 


U.9o9 


A 0*30 


3oU 


2/ 


U.945 


A 111'7 
U.oi/ 


3o3 


2/ 


A O/l C 


A C17 


3o4 


oc 


A 009 


A 977 


o o^ 
3oj 


32 


A 

U,9o3 


A 




A A 

44 


A 00>l 

0,924 


A 

U.j04 


367 


2o 


U.9 / 1 


A OO/f 


360 


19 


A OQQ 

yj.yoy 


A C/CO 
U.oOZ 


389 


24 




U.y4 / 


390 


O A 

34 


A A/IO 

0.942 


0.63j 


391 


16 


A AOO 

0.922 


A 01 

O./lo 


394 


1 A 

19 


n AQO 

0.98/ 


A Q7A 

u.y /u 


O AO 

39o 


36 


A OQO 

0.992 


A 0^#J 

O.oOO 


404 


13 


A A<A 

0.959 


A 0 1 0 

U.olo 


At '7 

417 


oo 

23 


A AOiC 

0.986 


A 000 

0.8 /o 


/I O 1 

421 


1 A 

19 


A AAyl 

0.904 


A C/l'\ 
0.0/1 


425 


oo 

28 


A AO 1 

0.971 


A Ol O 
O./l / 


431 


16 


A AOO 

0.988 


A AC 1 

0.951 


452 


1 o 

18 


A r\AA 

0.944 


O./lo 


459 


21 


A AA 1 

0.991 


A AAO 

0.902 


4oo 


O 1 

21 


A AO/1 

0.984 


u.ooy 


47 o 


/I A 

40 


A AOO 

0.9 /9 


A 0 1 '3 

U.O 13 


486 


1 O 

18 


A OO-^ 

O.ooJ 


0.693 


y| A A 

499 


oo 

22 


A A<iO 

0.962 


A DAO 

0.908 


501 


1 A 

19 


A A^O 

0.962 


A 000 

0.877 


CI >f 

514 


A A 

44 


A C\A 1 

0.941 


0.624 


529 


OA 

20 


A A<0 

0.952 


A 001 

0. /91 


533 


O A 

39 


A A1 il 

0.914 


A 01 A 
0. /19 


548 


o o 
28 


A OCT 


A AQO 


561 


oo 

28 


A AO/I 

0.9 /4 


A QO/1 

0.894 


562 


oo 
28 


A AO/f 

0,9./4 


0.893 


564 


1 o 

18 


A A/IA 

0.949 


A 0A< 

0.806 


576 


1 A 

19 


A AOO 

0.972 


A O/IO 

0.94/ 


584 


oo 
29 


0.96o 


A 7CC 
U. /OJ 


co< 
585 


oo 
28 


A dni 
0.9 ID 


A 01 A 

0.810 


591 


1 A 

19 


A A/1 A 

0.949 


A OQO 

0-89/ 


coo 
592 


O A 

24 


A AA1 

0.991 


A OC/t 

0.954 


594 


OA 

20 


A O0< 

0.985 


A OCO 


j95 


OA 


A QOC 


A OCO 


ol2 


Ol 

23 


A A/^O 

0.962 


A OIA 

U.91U 




1 1 
31 


A AO/I 

0.9 /4 


A QQC 

0.895 


021 


1 c 
15 


A OCO 

0-959 


A 70C 


d33 


2o 


A AAO 

0.969 


A O'iO 


<>IA 

04U 


OA 

20 


A A/IA 

0.949 


A Cylo 


</lc 
045 


o< 
25 


A Al 1 
0.91 1 


A OCO 

u. /oy 


064 


oc 
25 


A AAO 

0.992 


A coo 

U.o// 


691 


oo 
32 


A AOO 

0.983 


A QOC 

0.825 


oyo 


AA 


U.7Z4 




700 


19 


0.982 


0.941 


710 


26 


0.971 


0.894 


714 


23 


0.965 


0.907 
718 | 19 | 0.989 | 0.862
725 | 21 | 0.976 | 0.851
728 | 33 | 0.961 | 0.895
734 | 25 | 0.963 | 0.660
741 | 34 | 0.942 | 0.635
744 | 19 | 0.959 | 0.924
747 | 16 | 0.922 | 0.716
756 | 26 | 0.973 | 0.864
761 | 22 | 0.986 | 0.943
768 | 27 | 0.916 | 0.758
769 | 19 | 0.987 | 0.970
770 | 22 | 0.981 | 0.933
771 | 34 | 0.993 | 0.893
773 | 20 | 0.968 | 0.939
774 | 21 | 0.971 | 0.945
778 | 22 | 0.986 | 0.943
779 | 32 | 0.973 | 0.846
781 | 23 | 0.950 | 0.857
785 | 27 | 0.916 | 0.758
786 | 27 | 0.916 | 0.758
788 | 22 | 0.981 | 0.933
793 | 22 | 0.986 | 0.803
794 | 39 | 0.892 | 0.654
797 | 27 | 0.965 | 0.847
810 | 22 | 0.981 | 0.933
823 | 34 | 0.993 | 0.893
825 | 17 | 0.962 | 0.778
837 | 20 | 0.968 | 0.939
844 | 25 | 0.984 | 0.951
845 | 17 | 0.919 | 0.706
846 | 21 | 0.971 | 0.945
847 | 21 | 0.971 | 0.945
890 | 22 | 0.986 | 0.943
893 | 24 | 0.971 | 0.865
894 | 24 | 0.971 | 0.865
896 | 32 | 0.973 | 0.846
899 | 31 | 0.982 | 0.817
922 | 15 | 0.882 | 0.706
924 | 21 | 0.975 | 0.948
925 | 21 | 0.927 | 0.661
933 | 20 | 0.967 | 0.906
960 |  | 0.967 | u.yuo
967 | 38 | 0.970 | 0.784
968 | 47 | 0.970 | 0.557
972 | 36 | 0.945 | 0.775



TABLE 8



SEQ ID NO:


Method


Predicted beginning nucleotide location corresponding to first amino acid residue of peptide sequence


Predicted end nucleotide location corresponding to last amino acid residue of peptide sequence


Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic
Acid, E=Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine, L=Leucine,
M=Methionine, N=Asparagine, P=Proline, Q=Glutamine,
R=Arginine, S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop codon,
/=possible nucleotide deletion, \=possible nucleotide
insertion)


3955 


A 


235 


1272 


GPREVLAASSLADGSEEQVMAVALVRERDLSFPG 
VGDAVVNPTRWHLPAQPEMLYEGGEGRMETLK 



WO 01/57190 PCT/US01/04098




DKTLQELEELQNDSEAIDQLALESPEVQDLQLERE 

MALATNRSLAERNLEFQGPLEISRSNLSDRYQELR 

KLVERCQEQKAKLEKFSSALQPGTLLDLLQVEGM 

KIEEESEAMAEKFLEGEVPLETFLENFSSMRMLSH 

LRRVRVEKLQEVVRKPRASQELAGDAPPPRSPPP 

V/PPSPPGNTPCG*RAAAATISHASLPFALQPIPQPA 

CGPHCPWSPATGPFPSSVPALLLQRASGPHLPGSP 

AWTQGCCGLLLVPTEEHAAPPYGFPPPPGPAWPG 

Y 


3956 


A 


821 


385 


SICADRTERVGIFFYIPAGTTDEADVTHP*EGHSYL 

SNHAGIQRSSRP/SHYQGEAVHDNCFTADELQLLT 

YQLCHTYVRCTRSVSIPAPAYYAHLVAFRARYHL 

VDKEHDSAEGSHVSGQSNGRDPQALAKAVQfflQ 

DTLRTMYFA 


3957 


A 


4621 


240 


ELISTFKLLLEKKRSEVMKMKKRYEVGLEKLDSA 

SSQVATMQMELEALHPQLKVASKEVDEMMIMIE 

KESVEVAKTEKIVKADETIANEQAMASKAIKDEC 

DADLAGALPILESALAALDTLTAQDITWKSMKSP 

PAGVKLVMEAICILKGIKADKIPDPTGSGKKIEDF 

WGPAKRLLGDMRFLQSLHEYDKDNIPPAYMNIIR 

KNYIPNPDFVPEKIRNASTAAEGLCKWVIAMDSY 

DKVAKIVAPKKIKLAAAEGELKIAMDGLRKKQA 

ALKEVQDKLARLQDTLELNKQKKADLENQVDLC 

SKKLERAEQLIGGLGGEKTRWSHTALELGQLYIN 

LTGDILISSGVVAYLGAFTSTYRQNQTKEWTTLCK 

GRDIPCSDDCSLMGTLGEAVTIRTWNIAGLPSDSF 

SIDNGlilMNARRWPLMroPQSQANKWIKNMEKA 

NSLYVIKLSEPDYVRTLENCIQFGTPVLLENVGEE 

LDPILEPLLLKQTFKQGGSTCIRLGDSHEYAPDFR 

FYITTKLRNPHYLPETSVKVTLLNFMITPEGMQDQ 

LLGIVVAQERPDLEEEKQALILQGAENKRQLKEIE 

DKILEVLSSSEGNILEDETAIKILSSSKALANEISQK 

QEVAEETEKKIDTTRMGYRPIAfflSSILFFSLADLA 

NIEPMYQYSLTWFINLFILSIENSEKSEILAKRLQIL 

KDHFTYSLYVNVCRSLFEKDKLLFSFCLTINLLLH 

ERAINKAEWRFLLTGGIGLDNPYANPCTWLPQKS 

WDEICRLDDLPAFKTIRREFMRLKDGWKKVYDSL 

EPHHEVFPEEWEDKANEFQRMLIIRCLRPDKVIPM 

LQEFIINRLGRAFIEPPPFDLAKAFGDSNCCAPLIFV 

LSPGADPMAALLKFADDQGYGGSKLSSLSLGQGQ 

GPIAMKMLEICAVKEGTWVVLQNCHLATSWMPT 

LEKVCEELSPESTHPDFRMWLTSYPSPNFPVSVLQ 

NGVKMTNEAPKGLRANIIRSYLMDPISDPEFFGSC 

KKPEEFKKLLYGLCFFHALVQERRKFGPLWWNIP 

YEFNETDLRISVQQLHMFLNQYEELPYEALRYMT 

GECNYGGRVTDDWDRRTLRSILNKFFNPELVENS 

DYKFDSSGIYFVPPSGDHKSYIEYTKTLPLTPAPEI 

FGMNANADITKDQSETQLLFDNILLTQSRSAGAG 

AKSSDEVVNEVASDILGKLPNNFDIEAAMRRYPT 

TYTQSMNTVLVQEMGRJmLLKTIRDSCVNIQKA 

nCGLAVMSTDLEEVVSSILNVKIPEMWMGKSYPS 

LKPLGSYVNDFLARLKFLQQWYEVGPPPVFWLSG 

FFFTQAFLTGAQQNYARKYTIPIDLLGFDYEVMED 

KEYKHPPEDGVFIHGLFLDGASWNRKIKKLAESH 

PKILYDTVPVMWLKPCKRADIPKRPSYVAPLYKT 

SERRGVLSTTGHSlWVIA\M'rLPSDQPKEHWIGR 
GVALLCQLNS 


3958 


A 


35 


529 


GADMAKSKNHTTHNQSRKWHRNVIKKPLSQRYK 

SLKGVDPKFLGNMCFTKKHKKKGLKKMQADSA 

KAVSTCAKAIEALVKPKEVKPKIPKGVSCELN*LA 

YIAYPKFWTCACACIAKGLRLCQPKAKAQDQTK 

AQVQIKAQAAAPASVPTQAPKGAQAPTKASG 


3959 


A 


1883 


763 


LLVLLLRTNLLIASSTRISRATLTCSPPGIPVDPRVR 

PRVRSHLVMYLGITTGSLHKAVVSGDSSAHLVEEI 

QLFPDPEPVRNLQLAPTQGAVFVGFSGGVWRVPR 

ANCSVYESCVDCVLARDPHCAWDPESRTCCLLSA 

PNLNSWKQDMERGNPEWACASGPMSRSLRPQSR 

PQIIKEVLAVPNSILELPCPHLSALASYYWSHGPAA 

VPEASSTVYNGSLLLIVQDGVGGLYQCWATENGF 

SYPVISYWVDSQDQTLALDPELAGIPREHVKVPLT 

RVSGGAALAAQQSYWPHFVTVTVLFALVLSGALI 

ELVASPLRALRARGKVQGCETLRPGEKAPLSREQH 

LQSPKECRTSASDVDADNNCLGTEVA 


3960 


A 


1 


481 


SYAAPSLFVKSLYWALAFMAVLLAVSGVVIVVLA 
SRAGARCQQCPPGWVLSEEHCYYFSAEAQAWEA
SQAFCSAYHATLPLLSHTQDFLGRYPVSRHSWVG 
AWRGPQGWHWIDEAPLPPQLLPEDGEDNLDINCG 
ALEEGTLVAANCSTPRPWVCAKGTQ 



TABLE 9

SEQ ID NO: | Accession Number | Species | Description | Smith-Waterman Score | % Identity
3937 | Y27700 | Homo sapiens | Human secreted protein encoded by gene No. 12. | 193 | 25
3938 | AF093097 | Homo sapiens | putative RNA-binding protein Q99 | 3881 | 84
3939 | AB012308 | Anthocidaris crassispina | B2HC | 4169 | 74
3940 | U10248 | Homo sapiens | ribosomal protein L29 | 787 | 95
3941 | Y99418 | Homo sapiens | Human PRO1317 (UNQ783) amino acid sequence SEQ ID NO:277. | 4031 | 100
3942 | AL023516 | Gallus gallus | B locus C type Lectin | 198 | 35



TABLE 10

SEQ ID NO: | Accession No. | Description | Results*
3937 | PR00049 | WILM'S TUMOUR PROTEIN SIGNATURE | PR00049D 0.00 9.168e-11 209-224
3942 | BL00615 | C-type lectin domain proteins. | BL00615A 16.68 6.400e-11 37-55

* Results include in order: accession number subtype; raw score; p-value; position of signature in amino acid sequence
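By way of illustration only, the four-part Results entry described in the footnote to Table 10 (accession number subtype; raw score; p-value; position of signature) can be split into its fields as sketched below. The names `SignatureHit` and `parse_result` are hypothetical and do not appear in this document, and the input string is the cleaned reading of the Table 10 entry for SEQ ID NO: 3942; this is a minimal sketch under those assumptions, not part of the disclosed methods.

```python
from typing import NamedTuple


class SignatureHit(NamedTuple):
    """One Results entry from Table 10 (hypothetical container, for illustration)."""
    subtype: str      # accession number subtype, e.g. "BL00615A"
    raw_score: float  # raw score
    p_value: float    # p-value
    start: int        # first residue of the signature in the amino acid sequence
    end: int          # last residue of the signature in the amino acid sequence


def parse_result(field: str) -> SignatureHit:
    """Split a whitespace-separated Results entry into its four parts."""
    subtype, raw_score, p_value, span = field.split()
    start, end = span.split("-")
    return SignatureHit(subtype, float(raw_score), float(p_value), int(start), int(end))


# Cleaned reading of the Table 10 entry for SEQ ID NO: 3942 (an assumption
# about how the entry would look once extraction damage is repaired).
hit = parse_result("BL00615A 16.68 6.400e-11 37-55")
print(hit)
```

The position is kept as two integers so that the signature can be located directly in the corresponding amino acid sequence.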
TABLE 11

SEQ ID NO: | PFAM Name | Description | P-Value | PFAM Score
3938 | Piwi | Piwi domain | 2.6e-150 | 512.7
3940 | Ribosomal L29e | Ribosomal L29e protein family | 2.3e-19 | 77.8
3941 | Sema | Sema domain | 4e-181 | 615.1
3942 | lectin c | Lectin C-type domain | 0.086 | -7.1



TABLE 12

SEQ ID NO: | Position of end of Signal in Amino Acid Sequence | MaxS (Maximum Score) | MeanS (Mean Score)
3941 | 31 | 0.985 | 0.926
3942 | 21 | 0.974 | 0.894


TABLE 13

SEQ ID NO: of full length nucleotide sequence | SEQ ID NO: of full length peptide sequence | SEQ ID NO: of contig nucleotide sequence | SEQ ID NO: of contig peptide sequence | Priority Docket number corresponding SEQ ID NO: in priority application | SEQ ID NO: in USSN 09/496,914
3937 | 3943 | 3949 | 3955 | 787CIP2G 1 | 787 3587
3938 | 3944 | 3950 | 3956 | 787CIP2G 2 | 787 3813
3939 | 3945 | 3951 | 3957 | 787CIP2G 3 | 787 4462
3940 | 3946 | 3952 | 3958 | 787CIP2G 4 | 787 4887
3941 | 3947 | 3953 | 3959 | 787CIP2G 5 | 787 5794
3942 | 3948 | 3954 | 3960 | 787CIP2G 6 | 787 8743



TABLE 14

TISSUE ORIGIN                           LIBRARY/RNA SOURCE   HYSEQ LIBRARY NAME  SEQ ID NOS:

adult brain                             GIBCO                ABD003              3940
adult brain                             Clontech             ABR006              3940
adult brain                             Invitrogen           ABR014              3940
cultured preadipocytes                  Stratagene           ADP001              3937
adult heart                             GIBCO                AHR001              3940
adult kidney                            GIBCO                AKD001              3940
adult lung                              GIBCO                ALG001              3940
young liver                             GIBCO                ALV001              3940
adult ovary                             Invitrogen           AOV001              3938, 3940-3941
adult spleen                            GIBCO                ASP001              3940-3941
testis                                  GIBCO                ATS001              3940
bone marrow                             Clontech             BMD001              3938, 3940
bone marrow                             Clontech             BMD004              3940
adult cervix                            BioChain             CVX001              3940
endothelial cells                       Stratagene           EDT001              3940
fetal brain                             Clontech             FBR006              3940
fetal brain                             Invitrogen           FBT002              3940-3941
fetal heart                             Invitrogen           FHR001              3940
fetal kidney                            Clontech             FKD001              3940
fetal kidney                            Clontech             FKD002              3940

fetal liver-spleen                      Columbia University  FLS001              3937, 3940
fetal liver-spleen                      Columbia University  FLS002              3938, 3941
fetal liver-spleen                      Columbia University  FLS003              3940
fetal liver                             Clontech             FLV004              3940
fetal skin                              Invitrogen           FSK001              3940-3942
fetal spleen                            BioChain             FSP001              3940
fetal brain                             GIBCO                HFB001              3937, 3940-3941
infant brain                            Columbia University  IB2002              3937, 3939, 3941
leukocyte                               GIBCO                LUC001              3940-3941
leukocyte                               Clontech             LUC003              3940-3941
melanoma from cell line ATCC #CRL 1424  Clontech             MEL004              3940
mammary gland                           Invitrogen           MMG001              3937, 3940-3941
neuronal cells                          Stratagene           NTU001              3937, 3942
prostate                                Clontech             PRT001              3938
rectum                                  Invitrogen           REC001              3940
salivary gland                          Clontech             SALs03              3941
small intestine                         Clontech             SIN001              3940
skeletal muscle                         Clontech             SKM001              3940
spinal cord                             Clontech             SPC001              3940
thymus                                  Clontech             THMc02              3938
thyroid gland                           Clontech             THR001              3942
uterus                                  Clontech             UTR001              3940

WHAT IS CLAIMED IS: 

1. An isolated polynucleotide comprising a nucleotide sequence selected from the group 
consisting of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954, a full length protein 
coding portion of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954, a mature protein 
coding portion of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954, an active domain 
coding portion of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954, and complementary 
sequences thereof. 

2. An isolated polynucleotide encoding a polypeptide with biological activity, wherein said 
polynucleotide hybridizes to the polynucleotide of claim 1 under stringent hybridization 
conditions. 

3. An isolated polynucleotide encoding a polypeptide with biological activity, wherein said 
polynucleotide has greater than about 90% sequence identity with the polynucleotide of claim 1. 

4. The polynucleotide of claim 1 wherein said polynucleotide is DNA. 

5. An isolated polynucleotide of claim 1 wherein said polynucleotide comprises the 
complementary sequences. 

6. A vector comprising the polynucleotide of claim 1. 

7. An expression vector comprising the polynucleotide of claim 1. 

8. A host cell genetically engineered to comprise the polynucleotide of claim 1. 

9. A host cell genetically engineered to comprise the polynucleotide of claim 1 operatively 
associated with a regulatory sequence that modulates expression of the polynucleotide in the host 
cell. 

10. An isolated polypeptide, wherein the polypeptide is selected from the group consisting of: 

(a) a polypeptide encoded by any one of the polynucleotides of claim 1; and 

(b) a polypeptide encoded by a polynucleotide hybridizing under stringent conditions 
with any one of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954. 

11. A composition comprising the polypeptide of claim 10 and a carrier. 



12. An antibody directed against the polypeptide of claim 10. 

13. A method for detecting the polynucleotide of claim 1 in a sample, comprising: 

a) contacting the sample with a compound that binds to and forms a complex 
with the polynucleotide of claim 1 for a period sufficient to form the complex; and 

b) detecting the complex, so that if a complex is detected, the polynucleotide 
of claim 1 is detected. 

14. A method for detecting the polynucleotide of claim 1 in a sample, comprising: 

a) contacting the sample under stringent hybridization conditions with 
nucleic acid primers that anneal to the polynucleotide of claim 1 under such conditions; 

b) amplifying a product comprising at least a portion of the polynucleotide of 
claim 1; and 

c) detecting said product and thereby the polynucleotide of claim 1 in the 
sample. 

15. The method of claim 14, wherein the polynucleotide is an RNA molecule and the method 
further comprises reverse transcribing an annealed RNA molecule into a cDNA polynucleotide. 

16. A method for detecting the polypeptide of claim 10 in a sample, comprising: 

a) contacting the sample with a compound that binds to and forms a complex 
with the polypeptide under conditions and for a period sufficient to form the complex; and 

b) detecting formation of the complex, so that if a complex formation is 
detected, the polypeptide of claim 10 is detected. 

17. A method for identifying a compound that binds to the polypeptide of claim 10, 
comprising: 

a) contacting the compound with the polypeptide of claim 10 under 
conditions sufficient to form a polypeptide/compound complex; and 

b) detecting the complex, so that if the polypeptide/compound complex is 
detected, a compound that binds to the polypeptide of claim 10 is identified. 

18. A method for identifying a compound that binds to the polypeptide of claim 10, 
comprising: 

a) contacting the compound with the polypeptide of claim 10, in a cell, under 
conditions sufficient to form a polypeptide/compound complex, wherein the complex drives 
expression of a reporter gene sequence in the cell; and 

b) detecting the complex by detecting reporter gene sequence expression, so 
that if the polypeptide/compound complex is detected, a compound that binds to the polypeptide 
of claim 10 is identified. 

19. A method of producing the polypeptide of claim 10, comprising: 

a) culturing a host cell comprising a polynucleotide sequence selected from 
the group consisting of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954, a mature 
protein coding portion of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954, an active 
domain coding portion of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954, 
complementary sequences thereof and a polynucleotide sequence hybridizing under stringent 
conditions to SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954, under conditions 
sufficient to express the polypeptide in said cell; and 

b) isolating the polypeptide from the cell culture or cells of step (a). 

20. An isolated polypeptide comprising an amino acid sequence selected from the group 
consisting of any one of the polypeptides SEQ ID NO: 985-1968, 2953-3936, 3943-3948 or 
3955-3960, the mature protein portion thereof, or the active domain thereof. 

21. The polypeptide of claim 20 wherein the polypeptide is provided on a polypeptide array. 

22. A collection of polynucleotides, wherein the collection comprises the sequence 
information of at least one of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954. 

23. The collection of claim 22, wherein the collection is provided on a nucleic acid array. 

24. The collection of claim 23, wherein the array detects full-matches to any one of the 
polynucleotides in the collection. 

25. The collection of claim 23, wherein the array detects mismatches to any one of the 
polynucleotides in the collection. 

26. The collection of claim 22, wherein the collection is provided in a computer-readable 
format. 

27. A method of treatment comprising administering to a mammalian subject in need thereof 
a therapeutic amount of a composition comprising a polypeptide of claim 10 or 20 and a 
pharmaceutically acceptable carrier. 

28. A method of treatment comprising administering to a mammalian subject in need thereof 
a therapeutic amount of a composition comprising an antibody that specifically binds to a 
polypeptide of claim 10 or 20 and a pharmaceutically acceptable carrier. 